From patchwork Thu Mar 20 13:49:01 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Viresh Kumar
X-Patchwork-Id: 26709
From: Viresh Kumar
To: tglx@linutronix.de, fweisbec@gmail.com, peterz@infradead.org, mingo@kernel.org, tj@kernel.org, lizefan@huawei.com
Cc: linaro-kernel@lists.linaro.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Viresh Kumar
Subject: [RFC 4/4] cpuset: Add cpusets.quiesce option
Date: Thu, 20 Mar 2014 19:19:01 +0530
X-Mailer: git-send-email 1.7.12.rc2.18.g61b472e

For networking applications, platforms need to provide one CPU for each
user-space data plane thread. These CPUs should not be interrupted by the
kernel at all unless user space has requested a syscall.

Currently, background kernel activities such as timers, hrtimers and
watchdogs run on almost every CPU, and these need to be migrated to other
CPUs.

To achieve that, this patch adds another option to cpusets: 'quiesce'.
Writing '1' to this file migrates unbound/unpinned timers/workqueues away
from the CPUs of the cpuset in question. Writing '0' has no effect, and the
file cannot be read from user space because no state is maintained.

Currently, only timers are migrated. Other kernel infrastructure will follow
later.

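For illustration only, a user-space caller might trigger the migration
roughly as sketched below. This is a minimal sketch, not part of the patch:
the helper name cpuset_quiesce() is hypothetical, and the control file path
depends on where the cpuset hierarchy is mounted and what the cpuset is
called, so the caller passes it in.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "1" to a cpuset's quiesce control file.
 * The path is supplied by the caller because the cgroup mount point and
 * cpuset name are system specific (assumption, not taken from the patch).
 * Returns 0 on success or a negative errno value on failure.
 */
int cpuset_quiesce(const char *path)
{
	ssize_t n;
	int fd, err = 0;

	/* The file is write-only: the patch provides no read handler. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, "1", 1);
	if (n < 0)
		err = -errno;
	else if (n != 1)
		err = -EIO;

	close(fd);
	return err;
}
```

A caller would pass the path of the quiesce file inside the target cpuset
directory. Since the file keeps no state, the write has to be repeated
whenever timers should be moved away again.
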
Suggested-by: Peter Zijlstra
Signed-off-by: Viresh Kumar
---
 kernel/cpuset.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 3d54c41..1b79ae6 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -43,10 +43,12 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -150,6 +152,7 @@ typedef enum {
 	CS_SCHED_LOAD_BALANCE,
 	CS_SPREAD_PAGE,
 	CS_SPREAD_SLAB,
+	CS_QUIESCE,
 } cpuset_flagbits_t;
 
 /* convenient tests for these bits */
@@ -1208,6 +1211,44 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
 	return 0;
 }
 
+void timer_quiesce_cpu(void *cpu);
+
+/**
+ * quiesce_cpuset - Move unbound timers/workqueues away from cpuset.cpus
+ * @cs: cpuset to be quiesced
+ *
+ * To isolate a core with cpusets, all unbound timers/workqueues must move away
+ * from the isolated core. For simplicity, we currently migrate them to the
+ * first online CPU that is not part of tick_nohz_full_mask.
+ *
+ * Currently we are only migrating timers away.
+ */
+void quiesce_cpuset(struct cpuset *cs)
+{
+	int from_cpu, to_cpu;
+	cpumask_t cpumask;
+
+	cpumask_andnot(&cpumask, cpu_online_mask, cs->cpus_allowed);
+
+#ifdef CONFIG_NO_HZ_FULL
+	cpumask_andnot(&cpumask, &cpumask, tick_nohz_full_mask);
+#endif
+
+	if (cpumask_empty(&cpumask)) {
+		pr_err("%s: Couldn't find a CPU to migrate to\n", __func__);
+		return;
+	}
+
+	to_cpu = cpumask_first(&cpumask);
+
+	for_each_cpu(from_cpu, cs->cpus_allowed) {
+		pr_debug("%s: Migrating from CPU:%d to CPU:%d\n", __func__,
+			 from_cpu, to_cpu);
+		smp_call_function_single(to_cpu, timer_quiesce_cpu,
+					 (void *)from_cpu, true);
+	}
+}
+
 /**
  * update_tasks_flags - update the spread flags of tasks in the cpuset.
  * @cs: the cpuset in which each task's spread flags needs to be changed
@@ -1244,6 +1285,11 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 	int spread_flag_changed;
 	int err;
 
+	if (bit == CS_QUIESCE && turning_on) {
+		quiesce_cpuset(cs);
+		return 0;
+	}
+
 	trialcs = alloc_trial_cpuset(cs);
 	if (!trialcs)
 		return -ENOMEM;
@@ -1526,6 +1572,7 @@ typedef enum {
 	FILE_MEMORY_PRESSURE,
 	FILE_SPREAD_PAGE,
 	FILE_SPREAD_SLAB,
+	FILE_CPU_QUIESCE,
 } cpuset_filetype_t;
 
 static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -1569,6 +1616,9 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	case FILE_SPREAD_SLAB:
 		retval = update_flag(CS_SPREAD_SLAB, cs, val);
 		break;
+	case FILE_CPU_QUIESCE:
+		retval = update_flag(CS_QUIESCE, cs, val);
+		break;
 	default:
 		retval = -EINVAL;
 		break;
@@ -1837,6 +1887,12 @@ static struct cftype files[] = {
 		.private = FILE_MEMORY_PRESSURE_ENABLED,
 	},
 
+	{
+		.name = "quiesce",
+		.write_u64 = cpuset_write_u64,
+		.private = FILE_CPU_QUIESCE,
+	},
+
 	{ }	/* terminate */
 };
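
The target selection in quiesce_cpuset() can be modelled in user space for
experimentation. The sketch below is an illustration only, not kernel code:
pick_target_cpu() is a hypothetical name, and plain 64-bit bitmaps (bit N
stands for CPU N) stand in for the kernel's cpumask type. Pass 0 for
nohz_full to model a kernel built without CONFIG_NO_HZ_FULL.

```c
#include <stdint.h>

/*
 * Model of the mask arithmetic in quiesce_cpuset(): start from the online
 * CPUs, drop the CPUs of the cpuset being quiesced, then drop the
 * nohz_full CPUs. Returns the lowest-numbered remaining CPU, or -1 when
 * no candidate is left (the case where the patch logs an error).
 */
int pick_target_cpu(uint64_t online, uint64_t cpuset_cpus, uint64_t nohz_full)
{
	uint64_t candidates = online & ~cpuset_cpus & ~nohz_full;
	int cpu;

	if (!candidates)
		return -1;

	/* Equivalent of cpumask_first(): find the lowest set bit. */
	for (cpu = 0; cpu < 64; cpu++)
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;

	return -1;
}
```

For example, with CPUs 0-3 online, a cpuset holding CPUs 2-3 and CPU 0 in
the nohz_full set, the only remaining candidate is CPU 1. If the cpuset
covers every online CPU, nothing remains and no migration takes place.
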