From patchwork Sun Sep 5 14:35:40 2021
From: Yafang Shao <laoar.shao@gmail.com>
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de,
    juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
    achaiken@aurora.tech, Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH v4 1/8] sched, fair: use __schedstat_set() in set_next_entity()
Date: Sun, 5 Sep 2021 14:35:40 +0000
Message-Id: <20210905143547.4668-2-laoar.shao@gmail.com>
In-Reply-To: <20210905143547.4668-1-laoar.shao@gmail.com>
References: <20210905143547.4668-1-laoar.shao@gmail.com>

schedstat_enabled() has already been checked, so we can use
__schedstat_set() directly.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Alison Chaiken <achaiken@aurora.tech>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 25f056d87587..1b92eec48745 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4502,7 +4502,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 */
 	if (schedstat_enabled() &&
 	    rq_of(cfs_rq)->cfs.load.weight >= 2*se->load.weight) {
-		schedstat_set(se->statistics.slice_max,
+		__schedstat_set(se->statistics.slice_max,
 			max((u64)schedstat_val(se->statistics.slice_max),
 			se->sum_exec_runtime - se->prev_sum_exec_runtime));
 	}
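For context, the difference between the two macro flavours can be
sketched roughly as follows (a simplified approximation of what
kernel/sched/stats.h provides, not the verbatim definitions):
schedstat_set() re-tests schedstat_enabled() on every call, while
__schedstat_set() assumes the caller has already performed that check,
which is exactly the situation in set_next_entity() above.

  #ifdef CONFIG_SCHEDSTATS
  /* checked variant: guards the store with the schedstat_enabled() test */
  # define schedstat_set(var, val)			\
	do {						\
		if (schedstat_enabled())		\
			(var) = (val);			\
	} while (0)
  /* unchecked variant: caller must already have tested schedstat_enabled() */
  # define __schedstat_set(var, val)	do { (var) = (val); } while (0)
  #else
  # define schedstat_set(var, val)	do { } while (0)
  # define __schedstat_set(var, val)	do { } while (0)
  #endif

Dropping the redundant test removes one static-branch check from a path
that has already paid for it.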
From patchwork Sun Sep 5 14:35:41 2021
From: Yafang Shao <laoar.shao@gmail.com>
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de,
    juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
    achaiken@aurora.tech, Yafang Shao <laoar.shao@gmail.com>,
    kernel test robot <lkp@intel.com>
Subject: [PATCH v4 2/8] sched: make struct sched_statistics independent of fair sched class
Date: Sun, 5 Sep 2021 14:35:41 +0000
Message-Id: <20210905143547.4668-3-laoar.shao@gmail.com>
In-Reply-To: <20210905143547.4668-1-laoar.shao@gmail.com>
References: <20210905143547.4668-1-laoar.shao@gmail.com>

If we want to use the schedstats facility to trace other sched classes,
we have to make it independent of the fair sched class. struct
sched_statistics holds the scheduler statistics of a task_struct or a
task_group, so we can move it into struct task_struct and struct
task_group to achieve that goal.

After this patch, schedstats are organized as follows:

  struct task_struct {
	...
	struct sched_entity se;
	struct sched_rt_entity rt;
	struct sched_dl_entity dl;
	...
	struct sched_statistics stats;
	...
  };

Regarding task groups, schedstats is only supported for fair group
sched, so a new struct sched_entity_stats is introduced, as suggested
by Peter:

  struct sched_entity_stats {
	struct sched_entity     se;
	struct sched_statistics stats;
  } __no_randomize_layout;

With the se embedded in a task_group, we can then easily get at the
stats.

The sched_statistics members may be modified frequently when schedstats
is enabled. To avoid disturbing unrelated data that may share a
cacheline with them, struct sched_statistics is defined as cacheline
aligned.

As this patch changes a core struct of the scheduler, I verified the
performance impact with 'perf bench sched pipe', as suggested by Mel.
The results are as follows (in usecs/op):

                             Before   After
  kernel.sched_schedstats=0  5.2~5.4  5.2~5.4
  kernel.sched_schedstats=1  5.3~5.5  5.3~5.5

[These numbers differ slightly from the earlier version because my old
test machine was destroyed, so I had to use a different one.]

Almost no impact on scheduler performance.

No functional change.
[lkp@intel.com: reported build failure in earlier version]
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: kernel test robot <lkp@intel.com>
Cc: Alison Chaiken <achaiken@aurora.tech>
---
 include/linux/sched.h    |  6 +--
 kernel/sched/core.c      | 25 ++++++-----
 kernel/sched/deadline.c  |  4 +-
 kernel/sched/debug.c     | 91 +++++++++++++++++++++-------------------
 kernel/sched/fair.c      | 87 ++++++++++++++++++++++----------------
 kernel/sched/rt.c        |  4 +-
 kernel/sched/stats.h     | 19 +++++++
 kernel/sched/stop_task.c |  4 +-
 8 files changed, 140 insertions(+), 100 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1780260f237b..ed4293812ca9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -521,7 +521,7 @@ struct sched_statistics {
 	u64				nr_wakeups_passive;
 	u64				nr_wakeups_idle;
 #endif
-};
+} ____cacheline_aligned;
 
 struct sched_entity {
 	/* For load-balancing: */
@@ -537,8 +537,6 @@ struct sched_entity {
 
 	u64				nr_migrations;
 
-	struct sched_statistics		statistics;
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	int				depth;
 	struct sched_entity		*parent;
@@ -802,6 +800,8 @@ struct task_struct {
 	struct uclamp_se		uclamp[UCLAMP_CNT];
 #endif
 
+	struct sched_statistics		stats;
+
 #ifdef CONFIG_PREEMPT_NOTIFIERS
 	/* List of struct preempt_notifier: */
 	struct hlist_head		preempt_notifiers;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f963c8113375..04c04504249a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3489,11 +3489,11 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 #ifdef CONFIG_SMP
 	if (cpu == rq->cpu) {
 		__schedstat_inc(rq->ttwu_local);
-		__schedstat_inc(p->se.statistics.nr_wakeups_local);
+		__schedstat_inc(p->stats.nr_wakeups_local);
 	} else {
 		struct sched_domain *sd;
 
-		__schedstat_inc(p->se.statistics.nr_wakeups_remote);
+		__schedstat_inc(p->stats.nr_wakeups_remote);
 		rcu_read_lock();
 		for_each_domain(rq->cpu, sd) {
 			if (cpumask_test_cpu(cpu, sched_domain_span(sd))) {
@@ -3505,14 +3505,14 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 	}
 
 	if (wake_flags & WF_MIGRATED)
-		__schedstat_inc(p->se.statistics.nr_wakeups_migrate);
+		__schedstat_inc(p->stats.nr_wakeups_migrate);
 #endif /* CONFIG_SMP */
 
 	__schedstat_inc(rq->ttwu_count);
-	__schedstat_inc(p->se.statistics.nr_wakeups);
+	__schedstat_inc(p->stats.nr_wakeups);
 
 	if (wake_flags & WF_SYNC)
-		__schedstat_inc(p->se.statistics.nr_wakeups_sync);
+		__schedstat_inc(p->stats.nr_wakeups_sync);
 }
 
 /*
@@ -4196,7 +4196,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
 #ifdef CONFIG_SCHEDSTATS
 	/* Even if schedstat is disabled, there should not be garbage */
-	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
+	memset(&p->stats, 0, sizeof(p->stats));
 #endif
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
@@ -9553,9 +9553,9 @@ void normalize_rt_tasks(void)
 			continue;
 
 		p->se.exec_start = 0;
-		schedstat_set(p->se.statistics.wait_start,  0);
-		schedstat_set(p->se.statistics.sleep_start, 0);
-		schedstat_set(p->se.statistics.block_start, 0);
+		schedstat_set(p->stats.wait_start,  0);
+		schedstat_set(p->stats.sleep_start, 0);
+		schedstat_set(p->stats.block_start, 0);
 
 		if (!dl_task(p) && !rt_task(p)) {
 			/*
@@ -10397,11 +10397,14 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
 		seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
 
 		if (schedstat_enabled() && tg != &root_task_group) {
+			struct sched_statistics *stats;
 			u64 ws = 0;
 			int i;
 
-			for_each_possible_cpu(i)
-				ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+			for_each_possible_cpu(i) {
+				stats = __schedstats_from_se(tg->se[i]);
+				ws += schedstat_val(stats->wait_sum);
+			}
 
 			seq_printf(sf, "wait_sum %llu\n", ws);
 		}
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e94314633b39..51dd30990042 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1265,8 +1265,8 @@ static void update_curr_dl(struct rq *rq)
 		return;
 	}
 
-	schedstat_set(curr->se.statistics.exec_max,
-		      max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+		      max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 49716228efb4..da923347c8f3 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -440,11 +440,14 @@ void dirty_sched_domain_sysctl(int cpu)
 static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group *tg)
 {
 	struct sched_entity *se = tg->se[cpu];
+	struct sched_statistics *stats = __schedstats_from_se(se);
 
 #define P(F)		SEQ_printf(m, "  .%-30s: %lld\n",	#F, (long long)F)
-#define P_SCHEDSTAT(F)	SEQ_printf(m, "  .%-30s: %lld\n",	#F, (long long)schedstat_val(F))
+#define P_SCHEDSTAT(F)	SEQ_printf(m, "  .%-30s: %lld\n",	\
+		#F, (long long)schedstat_val(stats->F))
 #define PN(F)		SEQ_printf(m, "  .%-30s: %lld.%06ld\n", #F, SPLIT_NS((long long)F))
-#define PN_SCHEDSTAT(F)	SEQ_printf(m, "  .%-30s: %lld.%06ld\n", #F, SPLIT_NS((long long)schedstat_val(F)))
+#define PN_SCHEDSTAT(F)	SEQ_printf(m, "  .%-30s: %lld.%06ld\n", \
+		#F, SPLIT_NS((long long)schedstat_val(stats->F)))
 
 	if (!se)
 		return;
@@ -454,16 +457,16 @@ static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group
 	PN(se->sum_exec_runtime);
 
 	if (schedstat_enabled()) {
-		PN_SCHEDSTAT(se->statistics.wait_start);
-		PN_SCHEDSTAT(se->statistics.sleep_start);
-		PN_SCHEDSTAT(se->statistics.block_start);
-		PN_SCHEDSTAT(se->statistics.sleep_max);
-		PN_SCHEDSTAT(se->statistics.block_max);
-		PN_SCHEDSTAT(se->statistics.exec_max);
-		PN_SCHEDSTAT(se->statistics.slice_max);
-		PN_SCHEDSTAT(se->statistics.wait_max);
-		PN_SCHEDSTAT(se->statistics.wait_sum);
-		P_SCHEDSTAT(se->statistics.wait_count);
+		PN_SCHEDSTAT(wait_start);
+		PN_SCHEDSTAT(sleep_start);
+		PN_SCHEDSTAT(block_start);
+		PN_SCHEDSTAT(sleep_max);
+		PN_SCHEDSTAT(block_max);
+		PN_SCHEDSTAT(exec_max);
+		PN_SCHEDSTAT(slice_max);
+		PN_SCHEDSTAT(wait_max);
+		PN_SCHEDSTAT(wait_sum);
+		P_SCHEDSTAT(wait_count);
 	}
 
 	P(se->load.weight);
@@ -530,9 +533,9 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 		p->prio);
 
 	SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld",
-		SPLIT_NS(schedstat_val_or_zero(p->se.statistics.wait_sum)),
+		SPLIT_NS(schedstat_val_or_zero(p->stats.wait_sum)),
 		SPLIT_NS(p->se.sum_exec_runtime),
-		SPLIT_NS(schedstat_val_or_zero(p->se.statistics.sum_sleep_runtime)));
+		SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)));
 
 #ifdef CONFIG_NUMA_BALANCING
 	SEQ_printf(m, " %d %d", task_node(p), task_numa_group_id(p));
@@ -948,8 +951,8 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 		"---------------------------------------------------------"
 		"----------\n");
 
-#define P_SCHEDSTAT(F)  __PS(#F, schedstat_val(p->F))
-#define PN_SCHEDSTAT(F) __PSN(#F, schedstat_val(p->F))
+#define P_SCHEDSTAT(F)  __PS(#F, schedstat_val(p->stats.F))
+#define PN_SCHEDSTAT(F) __PSN(#F, schedstat_val(p->stats.F))
 
 	PN(se.exec_start);
 	PN(se.vruntime);
@@ -962,33 +965,33 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 	if (schedstat_enabled()) {
 		u64 avg_atom, avg_per_cpu;
 
-		PN_SCHEDSTAT(se.statistics.sum_sleep_runtime);
-		PN_SCHEDSTAT(se.statistics.wait_start);
-		PN_SCHEDSTAT(se.statistics.sleep_start);
-		PN_SCHEDSTAT(se.statistics.block_start);
-		PN_SCHEDSTAT(se.statistics.sleep_max);
-		PN_SCHEDSTAT(se.statistics.block_max);
-		PN_SCHEDSTAT(se.statistics.exec_max);
-		PN_SCHEDSTAT(se.statistics.slice_max);
-		PN_SCHEDSTAT(se.statistics.wait_max);
-		PN_SCHEDSTAT(se.statistics.wait_sum);
-		P_SCHEDSTAT(se.statistics.wait_count);
-		PN_SCHEDSTAT(se.statistics.iowait_sum);
-		P_SCHEDSTAT(se.statistics.iowait_count);
-		P_SCHEDSTAT(se.statistics.nr_migrations_cold);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_affine);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_running);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_hot);
-		P_SCHEDSTAT(se.statistics.nr_forced_migrations);
-		P_SCHEDSTAT(se.statistics.nr_wakeups);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_sync);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_migrate);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_local);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_remote);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_affine);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_affine_attempts);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_passive);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_idle);
+		PN_SCHEDSTAT(sum_sleep_runtime);
+		PN_SCHEDSTAT(wait_start);
+		PN_SCHEDSTAT(sleep_start);
+		PN_SCHEDSTAT(block_start);
+		PN_SCHEDSTAT(sleep_max);
+		PN_SCHEDSTAT(block_max);
+		PN_SCHEDSTAT(exec_max);
+		PN_SCHEDSTAT(slice_max);
+		PN_SCHEDSTAT(wait_max);
+		PN_SCHEDSTAT(wait_sum);
+		P_SCHEDSTAT(wait_count);
+		PN_SCHEDSTAT(iowait_sum);
+		P_SCHEDSTAT(iowait_count);
+		P_SCHEDSTAT(nr_migrations_cold);
+		P_SCHEDSTAT(nr_failed_migrations_affine);
+		P_SCHEDSTAT(nr_failed_migrations_running);
+		P_SCHEDSTAT(nr_failed_migrations_hot);
+		P_SCHEDSTAT(nr_forced_migrations);
+		P_SCHEDSTAT(nr_wakeups);
+		P_SCHEDSTAT(nr_wakeups_sync);
+		P_SCHEDSTAT(nr_wakeups_migrate);
+		P_SCHEDSTAT(nr_wakeups_local);
+		P_SCHEDSTAT(nr_wakeups_remote);
+		P_SCHEDSTAT(nr_wakeups_affine);
+		P_SCHEDSTAT(nr_wakeups_affine_attempts);
+		P_SCHEDSTAT(nr_wakeups_passive);
+		P_SCHEDSTAT(nr_wakeups_idle);
 
 		avg_atom = p->se.sum_exec_runtime;
 		if (nr_switches)
@@ -1054,7 +1057,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 void proc_sched_set_task(struct task_struct *p)
 {
 #ifdef CONFIG_SCHEDSTATS
-	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
+	memset(&p->stats, 0, sizeof(p->stats));
 #endif
 }
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1b92eec48745..8b4b97453cca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -837,8 +837,12 @@ static void update_curr(struct cfs_rq *cfs_rq)
 
 	curr->exec_start = now;
 
-	schedstat_set(curr->statistics.exec_max,
-		      max(delta_exec, curr->statistics.exec_max));
+	if (schedstat_enabled()) {
+		struct sched_statistics *stats = __schedstats_from_se(curr);
+
+		__schedstat_set(stats->exec_max,
+				max(delta_exec, stats->exec_max));
+	}
 
 	curr->sum_exec_runtime += delta_exec;
 	schedstat_add(cfs_rq->exec_clock, delta_exec);
@@ -866,39 +870,45 @@ static void update_curr_fair(struct rq *rq)
 static inline void
 update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	u64 wait_start, prev_wait_start;
+	struct sched_statistics *stats;
 
 	if (!schedstat_enabled())
 		return;
 
+	stats = __schedstats_from_se(se);
+
 	wait_start = rq_clock(rq_of(cfs_rq));
-	prev_wait_start = schedstat_val(se->statistics.wait_start);
+	prev_wait_start = schedstat_val(stats->wait_start);
 
 	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
 	    likely(wait_start > prev_wait_start))
 		wait_start -= prev_wait_start;
 
-	__schedstat_set(se->statistics.wait_start, wait_start);
+	__schedstat_set(stats->wait_start, wait_start);
 }
 
 static inline void
 update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	struct task_struct *p;
+	struct sched_statistics *stats;
+	struct task_struct *p = NULL;
 	u64 delta;
 
 	if (!schedstat_enabled())
 		return;
 
+	stats = __schedstats_from_se(se);
+
 	/*
 	 * When the sched_schedstat changes from 0 to 1, some sched se
 	 * maybe already in the runqueue, the se->statistics.wait_start
 	 * will be 0.So it will let the delta wrong. We need to avoid this
 	 * scenario.
 	 */
-	if (unlikely(!schedstat_val(se->statistics.wait_start)))
+	if (unlikely(!schedstat_val(stats->wait_start)))
 		return;
 
-	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
+	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(stats->wait_start);
 
 	if (entity_is_task(se)) {
 		p = task_of(se);
@@ -908,30 +918,33 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 			 * time stamp can be adjusted to accumulate wait time
 			 * prior to migration.
 			 */
-			__schedstat_set(se->statistics.wait_start, delta);
+			__schedstat_set(stats->wait_start, delta);
 			return;
 		}
 		trace_sched_stat_wait(p, delta);
 	}
 
-	__schedstat_set(se->statistics.wait_max,
-		      max(schedstat_val(se->statistics.wait_max), delta));
-	__schedstat_inc(se->statistics.wait_count);
-	__schedstat_add(se->statistics.wait_sum, delta);
-	__schedstat_set(se->statistics.wait_start, 0);
+	__schedstat_set(stats->wait_max,
+			max(schedstat_val(stats->wait_max), delta));
+	__schedstat_inc(stats->wait_count);
+	__schedstat_add(stats->wait_sum, delta);
+	__schedstat_set(stats->wait_start, 0);
 }
 
 static inline void
 update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	struct sched_statistics *stats;
 	struct task_struct *tsk = NULL;
 	u64 sleep_start, block_start;
 
 	if (!schedstat_enabled())
 		return;
 
-	sleep_start = schedstat_val(se->statistics.sleep_start);
-	block_start = schedstat_val(se->statistics.block_start);
+	stats = __schedstats_from_se(se);
+
+	sleep_start = schedstat_val(stats->sleep_start);
+	block_start = schedstat_val(stats->block_start);
 
 	if (entity_is_task(se))
 		tsk = task_of(se);
@@ -942,11 +955,11 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		if ((s64)delta < 0)
 			delta = 0;
 
-		if (unlikely(delta > schedstat_val(se->statistics.sleep_max)))
-			__schedstat_set(se->statistics.sleep_max, delta);
+		if (unlikely(delta > schedstat_val(stats->sleep_max)))
+			__schedstat_set(stats->sleep_max, delta);
 
-		__schedstat_set(se->statistics.sleep_start, 0);
-		__schedstat_add(se->statistics.sum_sleep_runtime, delta);
+		__schedstat_set(stats->sleep_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
 
 		if (tsk) {
 			account_scheduler_latency(tsk, delta >> 10, 1);
@@ -959,16 +972,16 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		if ((s64)delta < 0)
 			delta = 0;
 
-		if (unlikely(delta > schedstat_val(se->statistics.block_max)))
-			__schedstat_set(se->statistics.block_max, delta);
+		if (unlikely(delta > schedstat_val(stats->block_max)))
+			__schedstat_set(stats->block_max, delta);
 
-		__schedstat_set(se->statistics.block_start, 0);
-		__schedstat_add(se->statistics.sum_sleep_runtime, delta);
+		__schedstat_set(stats->block_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
 
 		if (tsk) {
 			if (tsk->in_iowait) {
-				__schedstat_add(se->statistics.iowait_sum, delta);
-				__schedstat_inc(se->statistics.iowait_count);
+				__schedstat_add(stats->iowait_sum, delta);
+				__schedstat_inc(stats->iowait_count);
 				trace_sched_stat_iowait(tsk, delta);
 			}
 
@@ -1030,10 +1043,10 @@ update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		/* XXX racy against TTWU */
 		state = READ_ONCE(tsk->__state);
 		if (state & TASK_INTERRUPTIBLE)
-			__schedstat_set(se->statistics.sleep_start,
+			__schedstat_set(tsk->stats.sleep_start,
 					rq_clock(rq_of(cfs_rq)));
 		if (state & TASK_UNINTERRUPTIBLE)
-			__schedstat_set(se->statistics.block_start,
+			__schedstat_set(tsk->stats.block_start,
 					rq_clock(rq_of(cfs_rq)));
 	}
 }
@@ -4502,8 +4515,10 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 */
 	if (schedstat_enabled() &&
 	    rq_of(cfs_rq)->cfs.load.weight >= 2*se->load.weight) {
-		__schedstat_set(se->statistics.slice_max,
-				max((u64)schedstat_val(se->statistics.slice_max),
+		struct sched_statistics *stats = __schedstats_from_se(se);
+
+		__schedstat_set(stats->slice_max,
+				max((u64)schedstat_val(stats->slice_max),
 				se->sum_exec_runtime - se->prev_sum_exec_runtime));
 	}
 
@@ -5994,12 +6009,12 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
 		target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);
 
-	schedstat_inc(p->se.statistics.nr_wakeups_affine_attempts);
+	schedstat_inc(p->stats.nr_wakeups_affine_attempts);
 	if (target == nr_cpumask_bits)
 		return prev_cpu;
 
 	schedstat_inc(sd->ttwu_move_affine);
-	schedstat_inc(p->se.statistics.nr_wakeups_affine);
+	schedstat_inc(p->stats.nr_wakeups_affine);
 	return target;
 }
 
@@ -7803,7 +7818,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
 		int cpu;
 
-		schedstat_inc(p->se.statistics.nr_failed_migrations_affine);
+		schedstat_inc(p->stats.nr_failed_migrations_affine);
 
 		env->flags |= LBF_SOME_PINNED;
 
@@ -7837,7 +7852,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	env->flags &= ~LBF_ALL_PINNED;
 
 	if (task_running(env->src_rq, p)) {
-		schedstat_inc(p->se.statistics.nr_failed_migrations_running);
+		schedstat_inc(p->stats.nr_failed_migrations_running);
 		return 0;
 	}
 
@@ -7859,12 +7874,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	    env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
 		if (tsk_cache_hot == 1) {
 			schedstat_inc(env->sd->lb_hot_gained[env->idle]);
-			schedstat_inc(p->se.statistics.nr_forced_migrations);
+			schedstat_inc(p->stats.nr_forced_migrations);
 		}
 		return 1;
 	}
 
-	schedstat_inc(p->se.statistics.nr_failed_migrations_hot);
+	schedstat_inc(p->stats.nr_failed_migrations_hot);
 	return 0;
 }
 
@@ -11506,7 +11521,7 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 		if (!cfs_rq)
 			goto err;
 
-		se = kzalloc_node(sizeof(struct sched_entity),
+		se = kzalloc_node(sizeof(struct sched_entity_stats),
 				  GFP_KERNEL, cpu_to_node(i));
 		if (!se)
 			goto err_free_rq;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 3daf42a0f462..95a7c3ad2dc3 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1009,8 +1009,8 @@ static void update_curr_rt(struct rq *rq)
 	if (unlikely((s64)delta_exec <= 0))
 		return;
 
-	schedstat_set(curr->se.statistics.exec_max,
-		      max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+		      max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index d8f8eb0c655b..fb6022e860af 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -41,6 +41,7 @@ rq_sched_info_dequeue(struct rq *rq, unsigned long long delta)
 #define   schedstat_val_or_zero(var)	((schedstat_enabled()) ? (var) : 0)
 
 #else /* !CONFIG_SCHEDSTATS: */
+
 static inline void rq_sched_info_arrive  (struct rq *rq, unsigned long long delta) { }
 static inline void rq_sched_info_dequeue(struct rq *rq, unsigned long long delta) { }
 static inline void rq_sched_info_depart  (struct rq *rq, unsigned long long delta) { }
@@ -53,8 +54,26 @@ static inline void rq_sched_info_depart  (struct rq *rq, unsigned long long delt
 # define   schedstat_set(var, val)	do { } while (0)
 # define   schedstat_val(var)		0
 # define   schedstat_val_or_zero(var)	0
+
 #endif /* CONFIG_SCHEDSTATS */
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+struct sched_entity_stats {
+	struct sched_entity     se;
+	struct sched_statistics stats;
+} __no_randomize_layout;
+#endif
+
+static inline struct sched_statistics *
+__schedstats_from_se(struct sched_entity *se)
+{
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	if (!entity_is_task(se))
+		return &container_of(se, struct sched_entity_stats, se)->stats;
+#endif
+	return &task_of(se)->stats;
+}
+
 #ifdef CONFIG_PSI
 /*
  * PSI tracks state that persists across sleeps, such as iowaits and
diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c
index f988ebe3febb..0b165a25f22f 100644
--- a/kernel/sched/stop_task.c
+++ b/kernel/sched/stop_task.c
@@ -78,8 +78,8 @@ static void put_prev_task_stop(struct rq *rq, struct task_struct *prev)
 	if (unlikely((s64)delta_exec < 0))
 		delta_exec = 0;
 
-	schedstat_set(curr->se.statistics.exec_max,
-		      max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+		      max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
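The layout trick behind __schedstats_from_se() can be demonstrated
standalone. Below is a minimal userspace sketch (with stub stand-ins
for the kernel types; it is only an illustration, not kernel code):
because struct sched_entity_stats embeds the sched_entity as its first
member and its layout is not randomized, the statistics of a group se
sit at a fixed offset and can be recovered with container_of().

  #include <stddef.h>
  #include <stdio.h>

  struct sched_entity { int depth; };				/* stub */
  struct sched_statistics { unsigned long long wait_sum; };	/* stub */

  struct sched_entity_stats {
	struct sched_entity	se;	/* must stay first for container_of() */
	struct sched_statistics	stats;
  };

  #define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

  int main(void)
  {
	struct sched_entity_stats ses = { .stats = { .wait_sum = 42 } };
	struct sched_entity *se = &ses.se;	/* a group only hands out the se */

	/* recover the stats block, as __schedstats_from_se() does */
	struct sched_statistics *stats =
		&container_of(se, struct sched_entity_stats, se)->stats;

	printf("wait_sum=%llu\n", stats->wait_sum);	/* prints 42 */
	return 0;
  }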
From patchwork Sun Sep 5 14:35:42 2021
From: Yafang Shao <laoar.shao@gmail.com>
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de,
    juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
    achaiken@aurora.tech, Yafang Shao <laoar.shao@gmail.com>,
    kernel test robot <lkp@intel.com>
Subject: [PATCH v4 3/8] sched: make schedstats helpers independent of fair sched class
Date: Sun, 5 Sep 2021 14:35:42 +0000
Message-Id: <20210905143547.4668-4-laoar.shao@gmail.com>
In-Reply-To: <20210905143547.4668-1-laoar.shao@gmail.com>
References: <20210905143547.4668-1-laoar.shao@gmail.com>

The original prototype of the schedstats helpers is:

  update_stats_wait_*(struct cfs_rq *cfs_rq, struct sched_entity *se)

The cfs_rq in these helpers is used to get the rq_clock, and the se is
used to get the struct sched_statistics and the struct task_struct. In
order to make these helpers available to all sched classes, we can pass
the rq, sched_statistics and task_struct directly. The new helpers are:

  update_stats_wait_*(struct rq *rq, struct task_struct *p,
		      struct sched_statistics *stats)

which are independent of the fair sched class.

To avoid vmlinux growing too large or introducing overhead when
!schedstat_enabled(), some new helpers that are only called after a
schedstat_enabled() check are also introduced, as suggested by Mel.
These helpers are in sched/stats.c:

  __update_stats_wait_*(struct rq *rq, struct task_struct *p,
			struct sched_statistics *stats)

The size of vmlinux is as follows (in bytes):

                       Before      After
  Size of vmlinux   826308552  826304640

The size is a little smaller because some functions are no longer
inlined after the change.

I also compared the sched performance with 'perf bench sched pipe', as
suggested by Mel.
The results are as follows (in usecs/op):

                             Before   After
  kernel.sched_schedstats=0  5.2~5.4  5.2~5.4
  kernel.sched_schedstats=1  5.3~5.5  5.3~5.5

[These numbers differ slightly from the previous version because my old
test machine was destroyed, so I had to use a different one.]

Almost no difference.

No functional change.

[lkp@intel.com: reported build failure in prev version]
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: kernel test robot <lkp@intel.com>
Cc: Alison Chaiken <achaiken@aurora.tech>
---
 kernel/sched/debug.c |   4 +-
 kernel/sched/fair.c  | 140 +++++++------------------------------------
 kernel/sched/stats.c | 103 +++++++++++++++++++++++++++++++
 kernel/sched/stats.h |  30 ++++++++++
 4 files changed, 159 insertions(+), 118 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index da923347c8f3..e08eee374176 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -439,8 +439,10 @@ void dirty_sched_domain_sysctl(int cpu)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group *tg)
 {
+	struct sched_statistics __maybe_unused *stats;
 	struct sched_entity *se = tg->se[cpu];
-	struct sched_statistics *stats = __schedstats_from_se(se);
+
+	stats = __schedstats_from_se(se);
 
 #define P(F)		SEQ_printf(m, "  .%-30s: %lld\n",	#F, (long long)F)
 #define P_SCHEDSTAT(F)	SEQ_printf(m, "  .%-30s: %lld\n",	\
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8b4b97453cca..219ad90a1762 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -838,8 +838,9 @@ static void update_curr(struct cfs_rq *cfs_rq)
 	curr->exec_start = now;
 
 	if (schedstat_enabled()) {
-		struct sched_statistics *stats = __schedstats_from_se(curr);
+		struct sched_statistics __maybe_unused *stats;
 
+		stats = __schedstats_from_se(curr);
 		__schedstat_set(stats->exec_max,
 				max(delta_exec, stats->exec_max));
 	}
@@ -867,32 +868,27 @@ static void update_curr_fair(struct rq *rq)
 }
 
 static inline void
-update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_wait_start_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	u64 wait_start, prev_wait_start;
 	struct sched_statistics *stats;
+	struct task_struct *p = NULL;
 
 	if (!schedstat_enabled())
 		return;
 
 	stats = __schedstats_from_se(se);
 
-	wait_start = rq_clock(rq_of(cfs_rq));
-	prev_wait_start = schedstat_val(stats->wait_start);
-
-	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
-	    likely(wait_start > prev_wait_start))
-		wait_start -= prev_wait_start;
+	if (entity_is_task(se))
+		p = task_of(se);
 
-	__schedstat_set(stats->wait_start, wait_start);
+	__update_stats_wait_start(rq_of(cfs_rq), p, stats);
 }
 
 static inline void
-update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_wait_end_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct sched_statistics *stats;
 	struct task_struct *p = NULL;
-	u64 delta;
 
 	if (!schedstat_enabled())
 		return;
@@ -908,105 +904,34 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	if (unlikely(!schedstat_val(stats->wait_start)))
 		return;
 
-	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(stats->wait_start);
-
-	if (entity_is_task(se)) {
+	if (entity_is_task(se))
 		p = task_of(se);
 
-		if (task_on_rq_migrating(p)) {
-			/*
-			 * Preserve migrating task's wait time so wait_start
-			 * time stamp can be adjusted to accumulate wait time
-			 * prior to migration.
-			 */
-			__schedstat_set(stats->wait_start, delta);
-			return;
-		}
-
-		trace_sched_stat_wait(p, delta);
-	}
-
-	__schedstat_set(stats->wait_max,
-			max(schedstat_val(stats->wait_max), delta));
-	__schedstat_inc(stats->wait_count);
-	__schedstat_add(stats->wait_sum, delta);
-	__schedstat_set(stats->wait_start, 0);
+	__update_stats_wait_end(rq_of(cfs_rq), p, stats);
 }
 
 static inline void
-update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_enqueue_sleeper_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct sched_statistics *stats;
 	struct task_struct *tsk = NULL;
-	u64 sleep_start, block_start;
 
 	if (!schedstat_enabled())
 		return;
 
 	stats = __schedstats_from_se(se);
 
-	sleep_start = schedstat_val(stats->sleep_start);
-	block_start = schedstat_val(stats->block_start);
-
 	if (entity_is_task(se))
 		tsk = task_of(se);
 
-	if (sleep_start) {
-		u64 delta = rq_clock(rq_of(cfs_rq)) - sleep_start;
-
-		if ((s64)delta < 0)
-			delta = 0;
-
-		if (unlikely(delta > schedstat_val(stats->sleep_max)))
-			__schedstat_set(stats->sleep_max, delta);
-
-		__schedstat_set(stats->sleep_start, 0);
-		__schedstat_add(stats->sum_sleep_runtime, delta);
-
-		if (tsk) {
-			account_scheduler_latency(tsk, delta >> 10, 1);
-			trace_sched_stat_sleep(tsk, delta);
-		}
-	}
-
-	if (block_start) {
-		u64 delta = rq_clock(rq_of(cfs_rq)) - block_start;
-
-		if ((s64)delta < 0)
-			delta = 0;
-
-		if (unlikely(delta > schedstat_val(stats->block_max)))
-			__schedstat_set(stats->block_max, delta);
-
-		__schedstat_set(stats->block_start, 0);
-		__schedstat_add(stats->sum_sleep_runtime, delta);
-
-		if (tsk) {
-			if (tsk->in_iowait) {
-				__schedstat_add(stats->iowait_sum, delta);
-				__schedstat_inc(stats->iowait_count);
-				trace_sched_stat_iowait(tsk, delta);
-			}
-
-			trace_sched_stat_blocked(tsk, delta);
-
-			/*
-			 * Blocking time is in units of nanosecs, so shift by
-			 * 20 to get a milliseconds-range estimation of the
-			 * amount of time that the task spent sleeping:
-			 */
-			if (unlikely(prof_on == SLEEP_PROFILING)) {
-				profile_hits(SLEEP_PROFILING,
-					     (void *)get_wchan(tsk),
-					     delta >> 20);
-			}
-			account_scheduler_latency(tsk, delta >> 10, 0);
-		}
-	}
+	__update_stats_enqueue_sleeper(rq_of(cfs_rq), tsk, stats);
 }
 
 /*
  * Task is being enqueued - update stats:
  */
 static inline void
-update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
+update_stats_enqueue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
 	if (!schedstat_enabled())
 		return;
@@ -1016,14 +941,14 @@ update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * a dequeue/enqueue event is a NOP)
 	 */
 	if (se != cfs_rq->curr)
-		update_stats_wait_start(cfs_rq, se);
+		update_stats_wait_start_fair(cfs_rq, se);
 
 	if (flags & ENQUEUE_WAKEUP)
-		update_stats_enqueue_sleeper(cfs_rq, se);
+		update_stats_enqueue_sleeper_fair(cfs_rq, se);
 }
 
 static inline void
-update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
+update_stats_dequeue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
 	if (!schedstat_enabled())
 		return;
@@ -1034,7 +959,7 @@ update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * waiting task:
 	 */
 	if (se != cfs_rq->curr)
-		update_stats_wait_end(cfs_rq, se);
+		update_stats_wait_end_fair(cfs_rq, se);
 
 	if ((flags & DEQUEUE_SLEEP) && entity_is_task(se)) {
 		struct task_struct *tsk = task_of(se);
@@ -4238,26 +4163,6 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
 
-static inline void check_schedstat_required(void)
-{
-#ifdef CONFIG_SCHEDSTATS
-	if (schedstat_enabled())
-		return;
-
-	/* Force schedstat enabled if a dependent tracepoint is active */
-	if (trace_sched_stat_wait_enabled()    ||
-	    trace_sched_stat_sleep_enabled()   ||
-	    trace_sched_stat_iowait_enabled()  ||
-	    trace_sched_stat_blocked_enabled() ||
-	    trace_sched_stat_runtime_enabled()) {
-		printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, "
-			     "stat_blocked and stat_runtime require the "
-			     "kernel parameter schedstats=enable or "
-			     "kernel.sched_schedstats=1\n");
-	}
-#endif
-}
-
 static inline bool cfs_bandwidth_used(void);
 
 /*
@@ -4331,7 +4236,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		place_entity(cfs_rq, se, 0);
 
 	check_schedstat_required();
-	update_stats_enqueue(cfs_rq, se, flags);
+	update_stats_enqueue_fair(cfs_rq, se, flags);
 	check_spread(cfs_rq, se);
 	if (!curr)
 		__enqueue_entity(cfs_rq, se);
@@ -4415,7 +4320,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	update_load_avg(cfs_rq, se, UPDATE_TG);
 	se_update_runnable(se);
 
-	update_stats_dequeue(cfs_rq, se, flags);
+	update_stats_dequeue_fair(cfs_rq, se, flags);
 
 	clear_buddies(cfs_rq, se);
@@ -4500,7 +4405,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 * a CPU. So account for the time it spent waiting on the
 		 * runqueue.
 		 */
-		update_stats_wait_end(cfs_rq, se);
+		update_stats_wait_end_fair(cfs_rq, se);
 		__dequeue_entity(cfs_rq, se);
 		update_load_avg(cfs_rq, se, UPDATE_TG);
 	}
@@ -4515,8 +4420,9 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 */
 	if (schedstat_enabled() &&
 	    rq_of(cfs_rq)->cfs.load.weight >= 2*se->load.weight) {
-		struct sched_statistics *stats = __schedstats_from_se(se);
+		struct sched_statistics __maybe_unused *stats;
 
+		stats = __schedstats_from_se(se);
 		__schedstat_set(stats->slice_max,
 				max((u64)schedstat_val(stats->slice_max),
 				se->sum_exec_runtime - se->prev_sum_exec_runtime));
@@ -4601,7 +4507,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 	check_spread(cfs_rq, prev);
 
 	if (prev->on_rq) {
-		update_stats_wait_start(cfs_rq, prev);
+		update_stats_wait_start_fair(cfs_rq, prev);
 		/* Put 'current' back into the tree. */
 		__enqueue_entity(cfs_rq, prev);
 		/* in !on_rq case, update occurred at dequeue */
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index 3f93fc3b5648..fad781ca7791 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -4,6 +4,109 @@
  */
 #include "sched.h"
 
+void __update_stats_wait_start(struct rq *rq, struct task_struct *p,
+			       struct sched_statistics *stats)
+{
+	u64 wait_start, prev_wait_start;
+
+	wait_start = rq_clock(rq);
+	prev_wait_start = schedstat_val(stats->wait_start);
+
+	if (p && likely(wait_start > prev_wait_start))
+		wait_start -= prev_wait_start;
+
+	__schedstat_set(stats->wait_start, wait_start);
+}
+
+void __update_stats_wait_end(struct rq *rq, struct task_struct *p,
+			     struct sched_statistics *stats)
+{
+	u64 delta = rq_clock(rq) - schedstat_val(stats->wait_start);
+
+	if (p) {
+		if (task_on_rq_migrating(p)) {
+			/*
+			 * Preserve migrating task's wait time so wait_start
+			 * time stamp can be adjusted to accumulate wait time
+			 * prior to migration.
+			 */
+			__schedstat_set(stats->wait_start, delta);
+
+			return;
+		}
+
+		trace_sched_stat_wait(p, delta);
+	}
+
+	__schedstat_set(stats->wait_max,
+			max(schedstat_val(stats->wait_max), delta));
+	__schedstat_inc(stats->wait_count);
+	__schedstat_add(stats->wait_sum, delta);
+	__schedstat_set(stats->wait_start, 0);
+}
+
+void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
+				    struct sched_statistics *stats)
+{
+	u64 sleep_start, block_start;
+
+	sleep_start = schedstat_val(stats->sleep_start);
+	block_start = schedstat_val(stats->block_start);
+
+	if (sleep_start) {
+		u64 delta = rq_clock(rq) - sleep_start;
+
+		if ((s64)delta < 0)
+			delta = 0;
+
+		if (unlikely(delta > schedstat_val(stats->sleep_max)))
+			__schedstat_set(stats->sleep_max, delta);
+
+		__schedstat_set(stats->sleep_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
+
+		if (p) {
+			account_scheduler_latency(p, delta >> 10, 1);
+			trace_sched_stat_sleep(p, delta);
+		}
+	}
+
+	if (block_start) {
+		u64 delta = rq_clock(rq) - block_start;
+
+		if ((s64)delta < 0)
+			delta = 0;
+
+		if (unlikely(delta > schedstat_val(stats->block_max)))
+			__schedstat_set(stats->block_max, delta);
+
+		__schedstat_set(stats->block_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
+
+		if (p) {
+			if (p->in_iowait) {
+				__schedstat_add(stats->iowait_sum, delta);
+				__schedstat_inc(stats->iowait_count);
+				trace_sched_stat_iowait(p, delta);
+			}
+
+			trace_sched_stat_blocked(p, delta);
+
+			/*
+			 * Blocking time is in units of nanosecs, so shift by
+			 * 20 to get a milliseconds-range estimation of the
+			 * amount of time that the task spent sleeping:
+			 */
+			if (unlikely(prof_on == SLEEP_PROFILING)) {
+				profile_hits(SLEEP_PROFILING,
+					     (void *)get_wchan(p),
+					     delta >> 20);
+			}
+			account_scheduler_latency(p, delta >> 10, 0);
+		}
+	}
+}
+
 /*
  * Current schedstat API version.
  *
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index fb6022e860af..cfb0893a83d4 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -2,6 +2,8 @@
 
 #ifdef CONFIG_SCHEDSTATS
 
+extern struct static_key_false sched_schedstats;
+
 /*
  * Expects runqueue lock to be held for atomicity of update
  */
@@ -40,6 +42,29 @@ rq_sched_info_dequeue(struct rq *rq, unsigned long long delta)
 #define   schedstat_val(var)		(var)
 #define   schedstat_val_or_zero(var)	((schedstat_enabled()) ? (var) : 0)
 
+void __update_stats_wait_start(struct rq *rq, struct task_struct *p,
+			       struct sched_statistics *stats);
+
+void __update_stats_wait_end(struct rq *rq, struct task_struct *p,
+			     struct sched_statistics *stats);
+void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
+				    struct sched_statistics *stats);
+
+static inline void
+check_schedstat_required(void)
+{
+	if (schedstat_enabled())
+		return;
+
+	/* Force schedstat enabled if a dependent tracepoint is active */
+	if (trace_sched_stat_wait_enabled()    ||
+	    trace_sched_stat_sleep_enabled()   ||
+	    trace_sched_stat_iowait_enabled()  ||
+	    trace_sched_stat_blocked_enabled() ||
+	    trace_sched_stat_runtime_enabled())
+		printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1\n");
+}
+
 #else /* !CONFIG_SCHEDSTATS: */
 
 static inline void rq_sched_info_arrive  (struct rq *rq, unsigned long long delta) { }
@@ -55,6 +80,11 @@ static inline void rq_sched_info_depart  (struct rq *rq, unsigned long long delt
 # define   schedstat_val(var)		0
 # define   schedstat_val_or_zero(var)	0
 
+# define __update_stats_wait_start(rq, p, stats)       do { } while (0)
+# define __update_stats_wait_end(rq, p, stats)         do { } while (0)
+# define __update_stats_enqueue_sleeper(rq, p, stats)  do { } while (0)
+# define check_schedstat_required()                    do { } while (0)
+
 #endif /* CONFIG_SCHEDSTATS */
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
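To make the benefit concrete: with the rq/task/stats-based helpers in
place, a wrapper for another sched class needs only a few lines. The
sketch below is hypothetical and not taken from this series (the names
and exact shape are assumptions for illustration only):

  /* hypothetical wrapper: wait-time accounting for an RT task */
  static inline void
  update_stats_wait_start_rt(struct rq *rq, struct task_struct *p)
  {
	if (!schedstat_enabled())
		return;

	/* task statistics now live in task_struct, not in the fair se */
	__update_stats_wait_start(rq, p, &p->stats);
  }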
From patchwork Sun Sep 5 14:35:43 2021
From: Yafang Shao <laoar.shao@gmail.com>
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de,
    juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
    achaiken@aurora.tech, Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH v4 4/8] sched: introduce task block time in schedstats
Date: Sun, 5 Sep 2021 14:35:43 +0000
Message-Id: <20210905143547.4668-5-laoar.shao@gmail.com>
In-Reply-To: <20210905143547.4668-1-laoar.shao@gmail.com>
References: <20210905143547.4668-1-laoar.shao@gmail.com>

Currently in schedstats we have sum_sleep_runtime and iowait_sum, but
there is no metric to show how long a task has been in the D
(uninterruptible) state. Once a task is in D state, it is blocked in the
kernel, for example waiting for a mutex. The D state is more frequent
than iowait, and it is more critical than the S state. So it is worth
adding a metric to measure it.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Alison Chaiken <achaiken@aurora.tech>
---
 include/linux/sched.h | 2 ++
 kernel/sched/debug.c  | 6 ++++--
 kernel/sched/stats.c  | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ed4293812ca9..4239a3fbe7f3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -502,6 +502,8 @@ struct sched_statistics {
 	u64				block_start;
 	u64				block_max;
 
+	s64				sum_block_runtime;
+
 	u64				exec_max;
 	u64				slice_max;
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index e08eee374176..32eb86932bcb 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -534,10 +534,11 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 		(long long)(p->nvcsw + p->nivcsw),
 		p->prio);
 
-	SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld",
+	SEQ_printf(m, "%9lld.%06ld %9lld.%06ld %9lld.%06ld %9lld.%06ld",
 		SPLIT_NS(schedstat_val_or_zero(p->stats.wait_sum)),
 		SPLIT_NS(p->se.sum_exec_runtime),
-		SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)));
+		SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)),
+		SPLIT_NS(schedstat_val_or_zero(p->stats.sum_block_runtime)));
 
 #ifdef CONFIG_NUMA_BALANCING
 	SEQ_printf(m, " %d %d", task_node(p), task_numa_group_id(p));
@@ -968,6 +969,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 		u64 avg_atom, avg_per_cpu;
 
 		PN_SCHEDSTAT(sum_sleep_runtime);
+		PN_SCHEDSTAT(sum_block_runtime);
 		PN_SCHEDSTAT(wait_start);
 		PN_SCHEDSTAT(sleep_start);
 		PN_SCHEDSTAT(block_start);
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index fad781ca7791..07dde2928c79 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -82,6 +82,7 @@ void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
 
 		__schedstat_set(stats->block_start, 0);
 		__schedstat_add(stats->sum_sleep_runtime, delta);
+		__schedstat_add(stats->sum_block_runtime, delta);
 
 		if (p) {
 			if (p->in_iowait) {
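The new metric appears wherever PN_SCHEDSTAT() prints, e.g. in
/proc/<pid>/sched. A small userspace sketch for pulling it out; it
assumes the printed label is simply "sum_block_runtime", which follows
from the #F stringification in PN_SCHEDSTAT above but may vary by
kernel version and config:

  /* build: cc -o blocktime blocktime.c ; usage: ./blocktime <pid> */
  #include <stdio.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
	char path[64], line[256];
	FILE *f;

	if (argc != 2)
		return 1;
	snprintf(path, sizeof(path), "/proc/%s/sched", argv[1]);
	f = fopen(path, "r");
	if (!f)
		return 1;
	/* print the line PN_SCHEDSTAT(sum_block_runtime) produces */
	while (fgets(line, sizeof(line), f))
		if (strstr(line, "sum_block_runtime"))
			fputs(line, stdout);
	fclose(f);
	return 0;
  }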
From patchwork Sun Sep 5 14:35:44 2021
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 507423
From: Yafang Shao
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, achaiken@aurora.tech, Yafang Shao
Subject: [PATCH v4 5/8] sched, rt: support sched_stat_runtime tracepoint for RT sched class
Date: Sun, 5 Sep 2021 14:35:44 +0000
Message-Id: <20210905143547.4668-6-laoar.shao@gmail.com>

The runtime of an RT task is already accounted in update_curr_rt(), so we
only need to add a tracepoint there. One difference between a fair task
and an RT task is that an RT task has no vruntime; to reuse the
sched_stat_runtime tracepoint, '0' is passed as the vruntime of an RT
task. The output of this tracepoint for an RT task is as follows:

  stress-9748 [039] d.h. 113.519352: sched_stat_runtime: comm=stress pid=9748 runtime=997573 [ns] vruntime=0 [ns]
  stress-9748 [039] d.h. 113.520352: sched_stat_runtime: comm=stress pid=9748 runtime=997627 [ns] vruntime=0 [ns]
  stress-9748 [039] d.h. 113.521352: sched_stat_runtime: comm=stress pid=9748 runtime=998203 [ns] vruntime=0 [ns]
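For comparison, a sketch (based on the fair-class call in update_curr(); not part of this diff): both classes now feed the same tracepoint and only the vruntime argument differs.

	trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);	/* fair: real vruntime */
	trace_sched_stat_runtime(curr, delta_exec, 0);			/* RT: no vruntime, pass 0 */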
Signed-off-by: Yafang Shao
Cc: Mel Gorman
Cc: Alison Chaiken
---
 kernel/sched/rt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 95a7c3ad2dc3..5d251112e51c 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1012,6 +1012,8 @@ static void update_curr_rt(struct rq *rq)
 	schedstat_set(curr->stats.exec_max,
 		      max(curr->stats.exec_max, delta_exec));
 
+	trace_sched_stat_runtime(curr, delta_exec, 0);
+
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
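As a usage note (standard tracefs interface, assuming tracefs is mounted under /sys/kernel/debug/tracing; nothing here is added by this patch), samples like the ones above can be captured with:

  $ echo 1 > /sys/kernel/debug/tracing/events/sched/sched_stat_runtime/enable
  $ cat /sys/kernel/debug/tracing/trace_pipe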
From patchwork Sun Sep 5 14:35:45 2021
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 507294
From: Yafang Shao
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, achaiken@aurora.tech, Yafang Shao, kernel test robot
Subject: [PATCH v4 6/8] sched, rt: support schedstats for RT sched class
Date: Sun, 5 Sep 2021 14:35:45 +0000
Message-Id: <20210905143547.4668-7-laoar.shao@gmail.com>

We want to measure the latency of RT tasks in our production environment
with the schedstats facility, but schedstats currently only supports the
fair sched class. This patch enables it for the RT sched class as well.

After making struct sched_statistics and its helpers independent of the
fair sched class, we can easily use the schedstats facility for the RT
sched class. The schedstat usage in the RT sched class is similar to that
in the fair sched class, for example:

                   fair                        RT
  enqueue          update_stats_enqueue_fair   update_stats_enqueue_rt
  dequeue          update_stats_dequeue_fair   update_stats_dequeue_rt
  put_prev_task    update_stats_wait_start     update_stats_wait_start_rt
  set_next_task    update_stats_wait_end       update_stats_wait_end_rt

The user can get the schedstats information in the same way as for the
fair sched class, i.e. from /proc/[pid]/sched. Note that schedstats is not
supported for RT groups. The output of an RT task's schedstats is as
follows:

  $ cat /proc/10349/sched
  ...
  sum_sleep_runtime            :        972.434535
  sum_block_runtime            :        960.433522
  wait_start                   :     188510.871584
  sleep_start                  :          0.000000
  block_start                  :          0.000000
  sleep_max                    :         12.001013
  block_max                    :        952.660622
  exec_max                     :          0.049629
  slice_max                    :          0.000000
  wait_max                     :          0.018538
  wait_sum                     :          0.424340
  wait_count                   :                49
  iowait_sum                   :        956.495640
  iowait_count                 :                24
  nr_migrations_cold           :                 0
  nr_failed_migrations_affine  :                 0
  nr_failed_migrations_running :                 0
  nr_failed_migrations_hot     :                 0
  nr_forced_migrations         :                 0
  nr_wakeups                   :                49
  nr_wakeups_sync              :                 0
  nr_wakeups_migrate           :                 0
  nr_wakeups_local             :                49
  nr_wakeups_remote            :                 0
  nr_wakeups_affine            :                 0
  nr_wakeups_affine_attempts   :                 0
  nr_wakeups_passive           :                 0
  nr_wakeups_idle              :                 0
  ...

The sched:sched_stat_{wait, sleep, iowait, blocked} tracepoints can be
used to trace RT tasks as well. The output of these tracepoints for an RT
task is as follows:

  - runtime
    stress-10352 [004] d.h. 1035.382286: sched_stat_runtime: comm=stress pid=10352 runtime=995769 [ns] vruntime=0 [ns]
    [vruntime=0 means it is an RT task]

  - wait
    <idle>-0 [004] dN.. 1227.688544: sched_stat_wait: comm=stress pid=10352 delay=46849882 [ns]

  - blocked
    kworker/4:1-465 [004] dN.. 1585.676371: sched_stat_blocked: comm=stress pid=17194 delay=189963 [ns]

  - iowait
    kworker/4:1-465 [004] dN.. 1585.675330: sched_stat_iowait: comm=stress pid=17189 delay=182848 [ns]

  - sleep
    sleep-18194 [023] dN.. 1780.891840: sched_stat_sleep: comm=sleep.sh pid=17767 delay=1001160770 [ns]
    sleep-18196 [023] dN.. 1781.893208: sched_stat_sleep: comm=sleep.sh pid=17767 delay=1001161970 [ns]
    sleep-18197 [023] dN.. 1782.894544: sched_stat_sleep: comm=sleep.sh pid=17767 delay=1001128840 [ns]
    [ In sleep.sh, it sleeps 1 sec each time. ]
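To make the helper mapping above concrete, here is a minimal sketch (illustration only, not part of the patch) of how the two wait-time hooks bracket an RT task's time on the runqueue:

	put_prev_task_rt(rq, p);  /* update_stats_wait_start_rt(): stamps p->stats.wait_start */
	/* ... p stays runnable on the rt_rq while other tasks run ... */
	set_next_task_rt(rq, p);  /* update_stats_wait_end_rt(): wait_sum += now - wait_start */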
[lkp@intel.com: reported build failure in earlier version]
Signed-off-by: Yafang Shao
Cc: kernel test robot
Cc: Mel Gorman
Cc: Alison Chaiken
---
 kernel/sched/rt.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 5d251112e51c..bb945f8faeca 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1273,6 +1273,112 @@ static void __delist_rt_entity(struct sched_rt_entity *rt_se, struct rt_prio_array *array)
 	rt_se->on_list = 0;
 }
 
+static inline struct sched_statistics *
+__schedstats_from_rt_se(struct sched_rt_entity *rt_se)
+{
+#ifdef CONFIG_RT_GROUP_SCHED
+	/* schedstats is not supported for rt group. */
+	if (!rt_entity_is_task(rt_se))
+		return NULL;
+#endif
+
+	return &rt_task_of(rt_se)->stats;
+}
+
+static inline void
+update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	stats = __schedstats_from_rt_se(rt_se);
+	if (!stats)
+		return;
+
+	__update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_enqueue_sleeper_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	stats = __schedstats_from_rt_se(rt_se);
+	if (!stats)
+		return;
+
+	__update_stats_enqueue_sleeper(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_enqueue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
+			int flags)
+{
+	if (!schedstat_enabled())
+		return;
+
+	if (flags & ENQUEUE_WAKEUP)
+		update_stats_enqueue_sleeper_rt(rt_rq, rt_se);
+}
+
+static inline void
+update_stats_wait_end_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	stats = __schedstats_from_rt_se(rt_se);
+	if (!stats)
+		return;
+
+	__update_stats_wait_end(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
+			int flags)
+{
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	if ((flags & DEQUEUE_SLEEP) && p) {
+		unsigned int state;
+
+		state = READ_ONCE(p->__state);
+		if (state & TASK_INTERRUPTIBLE)
+			__schedstat_set(p->stats.sleep_start,
+					rq_clock(rq_of_rt_rq(rt_rq)));
+
+		if (state & TASK_UNINTERRUPTIBLE)
+			__schedstat_set(p->stats.block_start,
+					rq_clock(rq_of_rt_rq(rt_rq)));
+	}
+}
+
 static void __enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
@@ -1346,6 +1452,8 @@ static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rq *rq = rq_of_rt_se(rt_se);
 
+	update_stats_enqueue_rt(rt_rq_of_se(rt_se), rt_se, flags);
+
 	dequeue_rt_stack(rt_se, flags);
 	for_each_sched_rt_entity(rt_se)
 		__enqueue_rt_entity(rt_se, flags);
@@ -1356,6 +1464,8 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rq *rq = rq_of_rt_se(rt_se);
 
+	update_stats_dequeue_rt(rt_rq_of_se(rt_se), rt_se, flags);
+
 	dequeue_rt_stack(rt_se, flags);
 
 	for_each_sched_rt_entity(rt_se) {
@@ -1378,6 +1488,9 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 	if (flags & ENQUEUE_WAKEUP)
 		rt_se->timeout = 0;
 
+	check_schedstat_required();
+	update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se);
+
 	enqueue_rt_entity(rt_se, flags);
 
 	if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
@@ -1578,7 +1691,12 @@ static void check_preempt_curr_rt(struct rq *rq, struct task_struct *p, int flags)
 
 static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
 {
+	struct sched_rt_entity *rt_se = &p->rt;
+	struct rt_rq *rt_rq = &rq->rt;
+
 	p->se.exec_start = rq_clock_task(rq);
+	if (on_rt_rq(&p->rt))
+		update_stats_wait_end_rt(rt_rq, rt_se);
 
 	/* The running task is never eligible for pushing */
 	dequeue_pushable_task(rq, p);
@@ -1652,6 +1770,12 @@ static struct task_struct *pick_next_task_rt(struct rq *rq)
 
 static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 {
+	struct sched_rt_entity *rt_se = &p->rt;
+	struct rt_rq *rt_rq = &rq->rt;
+
+	if (on_rt_rq(&p->rt))
+		update_stats_wait_start_rt(rt_rq, rt_se);
+
 	update_curr_rt(rq);
 
 	update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 1);
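A usage note on the interfaces involved (pre-existing, not added by this series): schedstats is off by default because of its runtime cost, so the counters above only accumulate once it is switched on, e.g.:

  $ echo 1 > /proc/sys/kernel/sched_schedstats    (or boot with schedstats=enable)
  $ cat /proc/<pid>/sched

check_schedstat_required(), now also called on the RT enqueue path, warns if the sched_stat_* trace events are enabled while schedstats is still off.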
From patchwork Sun Sep 5 14:35:46 2021
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 507422
From: Yafang Shao
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, achaiken@aurora.tech, Yafang Shao
Subject: [PATCH v4 7/8] sched, dl: support sched_stat_runtime tracepoint for deadline sched class
Date: Sun, 5 Sep 2021 14:35:46 +0000
Message-Id: <20210905143547.4668-8-laoar.shao@gmail.com>

The runtime of a DL task is already accounted in update_curr_dl(), so we
only need to add a tracepoint there. One difference between a fair task
and a DL task is that a DL task has no vruntime; to reuse the
sched_stat_runtime tracepoint, '0' is passed as the vruntime of a DL task.
The output of this tracepoint for a DL task is as follows:

  top-36462 [047] d.h. 6083.452103: sched_stat_runtime: comm=top pid=36462 runtime=409898 [ns] vruntime=0 [ns]
Signed-off-by: Yafang Shao
Cc: Mel Gorman
Cc: Alison Chaiken
---
 kernel/sched/deadline.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 51dd30990042..73fb33e1868f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1268,6 +1268,8 @@ static void update_curr_dl(struct rq *rq)
 	schedstat_set(curr->stats.exec_max,
 		      max(curr->stats.exec_max, delta_exec));
 
+	trace_sched_stat_runtime(curr, delta_exec, 0);
+
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
From patchwork Sun Sep 5 14:35:47 2021
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 507293
From: Yafang Shao
To: peterz@infradead.org, mingo@redhat.com, mgorman@suse.de, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, achaiken@aurora.tech, Yafang Shao
Subject: [PATCH v4 8/8] sched, dl: support schedstats for deadline sched class
Date: Sun, 5 Sep 2021 14:35:47 +0000
Message-Id: <20210905143547.4668-9-laoar.shao@gmail.com>

After making struct sched_statistics and its helpers independent of the
fair sched class, we can easily use the schedstats facility for the
deadline sched class as well. The schedstat usage in the DL sched class is
similar to that in the fair sched class, for example:

                   fair                        deadline
  enqueue          update_stats_enqueue_fair   update_stats_enqueue_dl
  dequeue          update_stats_dequeue_fair   update_stats_dequeue_dl
  put_prev_task    update_stats_wait_start     update_stats_wait_start_dl
  set_next_task    update_stats_wait_end       update_stats_wait_end_dl

The user can get the schedstats information in the same way as for the
fair sched class, i.e. from /proc/[pid]/sched. The output of a deadline
task's schedstats is as follows:

  $ cat /proc/69662/sched
  ...
  se.sum_exec_runtime          :       3067.696449
  se.nr_migrations             :                 0
  sum_sleep_runtime            :     720144.029661
  sum_block_runtime            :          0.547853
  wait_start                   :          0.000000
  sleep_start                  :   14131540.828955
  block_start                  :          0.000000
  sleep_max                    :       2999.974045
  block_max                    :          0.283637
  exec_max                     :          1.000269
  slice_max                    :          0.000000
  wait_max                     :          0.002217
  wait_sum                     :          0.762179
  wait_count                   :               733
  iowait_sum                   :          0.547853
  iowait_count                 :                 3
  nr_migrations_cold           :                 0
  nr_failed_migrations_affine  :                 0
  nr_failed_migrations_running :                 0
  nr_failed_migrations_hot     :                 0
  nr_forced_migrations         :                 0
  nr_wakeups                   :               246
  nr_wakeups_sync              :                 2
  nr_wakeups_migrate           :                 0
  nr_wakeups_local             :               244
  nr_wakeups_remote            :                 2
  nr_wakeups_affine            :                 0
  nr_wakeups_affine_attempts   :                 0
  nr_wakeups_passive           :                 0
  nr_wakeups_idle              :                 0
  ...

The sched:sched_stat_{wait, sleep, iowait, blocked} tracepoints can be
used to trace deadline tasks as well.
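The dequeue-side logic shared with the RT patch can be summarized in a short sketch (distilled for illustration; the real code is in the diff below): the task's state at dequeue decides which clock is started, and on wakeup sleep_start feeds sum_sleep_runtime while block_start feeds both sum_sleep_runtime and sum_block_runtime.

	unsigned int state = READ_ONCE(p->__state);

	if (state & TASK_INTERRUPTIBLE)		/* voluntary sleep */
		__schedstat_set(p->stats.sleep_start, rq_clock(rq));
	if (state & TASK_UNINTERRUPTIBLE)	/* blocked, e.g. on I/O */
		__schedstat_set(p->stats.block_start, rq_clock(rq));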
Signed-off-by: Yafang Shao
Cc: Mel Gorman
Cc: Alison Chaiken
---
 kernel/sched/deadline.c | 93 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 73fb33e1868f..d2c072b0ef01 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1474,6 +1474,82 @@ static inline bool __dl_less(struct rb_node *a, const struct rb_node *b)
 	return dl_time_before(__node_2_dle(a)->deadline, __node_2_dle(b)->deadline);
 }
 
+static inline struct sched_statistics *
+__schedstats_from_dl_se(struct sched_dl_entity *dl_se)
+{
+	return &dl_task_of(dl_se)->stats;
+}
+
+static inline void
+update_stats_wait_start_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se)
+{
+	struct sched_statistics *stats;
+
+	if (!schedstat_enabled())
+		return;
+
+	stats = __schedstats_from_dl_se(dl_se);
+	__update_stats_wait_start(rq_of_dl_rq(dl_rq), dl_task_of(dl_se), stats);
+}
+
+static inline void
+update_stats_wait_end_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se)
+{
+	struct sched_statistics *stats;
+
+	if (!schedstat_enabled())
+		return;
+
+	stats = __schedstats_from_dl_se(dl_se);
+	__update_stats_wait_end(rq_of_dl_rq(dl_rq), dl_task_of(dl_se), stats);
+}
+
+static inline void
+update_stats_enqueue_sleeper_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se)
+{
+	struct sched_statistics *stats;
+
+	if (!schedstat_enabled())
+		return;
+
+	stats = __schedstats_from_dl_se(dl_se);
+	__update_stats_enqueue_sleeper(rq_of_dl_rq(dl_rq), dl_task_of(dl_se), stats);
+}
+
+static inline void
+update_stats_enqueue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se,
+			int flags)
+{
+	if (!schedstat_enabled())
+		return;
+
+	if (flags & ENQUEUE_WAKEUP)
+		update_stats_enqueue_sleeper_dl(dl_rq, dl_se);
+}
+
+static inline void
+update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se,
+			int flags)
+{
+	struct task_struct *p = dl_task_of(dl_se);
+
+	if (!schedstat_enabled())
+		return;
+
+	if ((flags & DEQUEUE_SLEEP)) {
+		unsigned int state;
+
+		state = READ_ONCE(p->__state);
+		if (state & TASK_INTERRUPTIBLE)
+			__schedstat_set(p->stats.sleep_start,
+					rq_clock(rq_of_dl_rq(dl_rq)));
+
+		if (state & TASK_UNINTERRUPTIBLE)
+			__schedstat_set(p->stats.block_start,
+					rq_clock(rq_of_dl_rq(dl_rq)));
+	}
+}
+
 static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
 {
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
@@ -1504,6 +1580,8 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se, int flags)
 {
 	BUG_ON(on_dl_rq(dl_se));
 
+	update_stats_enqueue_dl(dl_rq_of_se(dl_se), dl_se, flags);
+
 	/*
 	 * If this is a wakeup or a new instance, the scheduling
 	 * parameters of the task might need updating. Otherwise,
@@ -1600,6 +1678,9 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		return;
 	}
 
+	check_schedstat_required();
+	update_stats_wait_start_dl(dl_rq_of_se(&p->dl), &p->dl);
+
 	enqueue_dl_entity(&p->dl, flags);
 
 	if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
@@ -1608,6 +1689,7 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 
 static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
+	update_stats_dequeue_dl(&rq->dl, &p->dl, flags);
 	dequeue_dl_entity(&p->dl);
 	dequeue_pushable_dl_task(rq, p);
 }
@@ -1827,7 +1909,12 @@ static void start_hrtick_dl(struct rq *rq, struct task_struct *p)
 
 static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first)
 {
+	struct sched_dl_entity *dl_se = &p->dl;
+	struct dl_rq *dl_rq = &rq->dl;
+
 	p->se.exec_start = rq_clock_task(rq);
+	if (on_dl_rq(&p->dl))
+		update_stats_wait_end_dl(dl_rq, dl_se);
 
 	/* You can't push away the running task */
 	dequeue_pushable_dl_task(rq, p);
@@ -1884,6 +1971,12 @@ static struct task_struct *pick_next_task_dl(struct rq *rq)
 
 static void put_prev_task_dl(struct rq *rq, struct task_struct *p)
 {
+	struct sched_dl_entity *dl_se = &p->dl;
+	struct dl_rq *dl_rq = &rq->dl;
+
+	if (on_dl_rq(&p->dl))
+		update_stats_wait_start_dl(dl_rq, dl_se);
+
 	update_curr_dl(rq);
 
 	update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1);