
[03/12] trace-cmd analyze: Show how much tasks run on each CPU

Message ID: 20220324025726.1727204-4-rostedt@goodmis.org
State: New
Series: trace-cmd: Add trace-cmd analyze command

Commit Message

Steven Rostedt March 24, 2022, 2:57 a.m. UTC
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

For each CPU that was traced, display the amount of time tasks ran on
it, listing the tasks from the longest runner to the shortest.
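
For example, the per-CPU portion of the output looks something like
this (illustrative values; print_time() groups the digits with the '_'
delimiter):

	CPU 0
	-------
	idle:	4_193_966_276 (%82)
	    Task name        PID 	     Run time
	    ---------        --- 	     --------
	       trace-cmd     1234	  523_420_110 (%10)
	            bash      987	  102_004_850 (%2)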

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 tracecmd/trace-analyze.c | 82 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

Comments

John Keeping March 25, 2022, 7:36 p.m. UTC | #1
On Wed, Mar 23, 2022 at 10:57:17PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> For each CPU that was traced, display the amount of time tasks ran on
> it, listing the tasks from the longest runner to the shortest.
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> +	for (i = 0; i < nr_tasks; i++) {
> +		task = cpu_tasks[i]->task;
> +
> +		if (!i) {
> +			printf("    Task name        PID \t     Run time\n");
> +			printf("    ---------        --- \t     --------\n");
> +		}
> +		printf("%16s %8d\t",
> +		       tep_data_comm_from_pid(tep, task->pid),
> +		       task->pid);
> +		print_time(cpu_tasks[i]->runtime, '_');
> +		printf(" (%%%lld)\n", (task->runtime * 100) / total_time);

Is there a reason for using the CPU-specific runtime for the value and
the total runtime for the percentage?

I expected the percentage to be the percentage of this CPU's time spent
running the task.
Steven Rostedt March 25, 2022, 8:18 p.m. UTC | #2
On Fri, 25 Mar 2022 19:36:24 +0000
John Keeping <john@metanate.com> wrote:

> On Wed, Mar 23, 2022 at 10:57:17PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> > 
> > For each CPU that was traced, display the amount of time tasks ran on
> > it, listing the tasks from the longest runner to the shortest.
> > 
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
> > +	for (i = 0; i < nr_tasks; i++) {
> > +		task = cpu_tasks[i]->task;
> > +
> > +		if (!i) {
> > +			printf("    Task name        PID \t     Run time\n");
> > +			printf("    ---------        --- \t     --------\n");
> > +		}
> > +		printf("%16s %8d\t",
> > +		       tep_data_comm_from_pid(tep, task->pid),
> > +		       task->pid);
> > +		print_time(cpu_tasks[i]->runtime, '_');
> > +		printf(" (%%%lld)\n", (task->runtime * 100) / total_time);  
> 
> Is there a reason for using the CPU-specific runtime for the value and
> the total runtime for the percentage?
> 
> I expected the percentage to be the percentage of this CPU's time spent
> running the task.

We modify it so that each CPU has the same run time, unless there are
missed events at the start (handled in later patches), in which case we
change total_time to be the total time of the events on that CPU and
not of the entire trace.

-- Steve
John Keeping March 26, 2022, 11:07 a.m. UTC | #3
On Fri, Mar 25, 2022 at 04:18:19PM -0400, Steven Rostedt wrote:
> On Fri, 25 Mar 2022 19:36:24 +0000
> John Keeping <john@metanate.com> wrote:
> 
> > On Wed, Mar 23, 2022 at 10:57:17PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> > > 
> > > For each CPU that was traced, display the amount of time tasks ran on
> > > it, listing the tasks from the longest runner to the shortest.
> > > 
> > > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > > ---
> > > +	for (i = 0; i < nr_tasks; i++) {
> > > +		task = cpu_tasks[i]->task;
> > > +
> > > +		if (!i) {
> > > +			printf("    Task name        PID \t     Run time\n");
> > > +			printf("    ---------        --- \t     --------\n");
> > > +		}
> > > +		printf("%16s %8d\t",
> > > +		       tep_data_comm_from_pid(tep, task->pid),
> > > +		       task->pid);
> > > +		print_time(cpu_tasks[i]->runtime, '_');
> > > +		printf(" (%%%lld)\n", (task->runtime * 100) / total_time);  
> > 
> > Is there a reason for using the CPU-specific runtime for the value and
> > the total runtime for the percentage?
> > 
> > I expected the percentage to be the percentage of this CPU's time spent
> > running the task.
> 
> We modify it so that each CPU has the same run time, unless there are
> missed events at the start (handled in later patches), in which case we
> change total_time to be the total time of the events on that CPU and
> not of the entire trace.

I think we're talking about different things here; I probably wasn't
entirely clear about this.  It's the numerator of this division that I'm
concerned about, and I wonder if this should be:

	(cpu_tasks[i]->runtime * 100) / total_time

If total_time is 10 seconds and there is a task A that runs on CPU0 for
2 seconds and on CPU1 for 1 second, then I expect to see:

	CPU 0
	A	2.000  (20%)

	CPU 1
	A	1.000  (10%)

But at the moment both of those lines will give 30% (although the actual
run times are correct).
Steven Rostedt March 26, 2022, 3:16 p.m. UTC | #4
On Sat, 26 Mar 2022 11:07:03 +0000
John Keeping <john@metanate.com> wrote:

> > > > +		print_time(cpu_tasks[i]->runtime, '_');
> > > > +		printf(" (%%%lld)\n", (task->runtime * 100) / total_time);    
> > > 
> > > Is there a reason for using the CPU-specific runtime for the value and
> > > the total runtime for the percentage?
> > > 
> > > I expected the percentage to be the percentage of this CPU's time spent
> > > running the task.
> > 
> > We modify it so that each CPU has the same run time, unless there are
> > missed events at the start (handled in later patches), in which case we
> > change total_time to be the total time of the events on that CPU and
> > not of the entire trace.
> 
> I think we're talking about different things here; I probably wasn't
> entirely clear about this.  It's the numerator of this division that I'm
> concerned about, and I wonder if this should be:
> 
> 	(cpu_tasks[i]->runtime * 100) / total_time

Ah, I did misunderstand you. Yes, that's a typo. Thanks for pointing
that out. I'll go fix it.
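
That is, using the same per-CPU value for the percentage as for the
printed time (an untested sketch of the corrected lines, per John's
suggestion):

	print_time(cpu_tasks[i]->runtime, '_');
	printf(" (%%%lld)\n", (cpu_tasks[i]->runtime * 100) / total_time);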

-- Steve
Kalesh Singh March 28, 2022, 4:23 p.m. UTC | #5
On Sat, Mar 26, 2022 at 8:16 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Sat, 26 Mar 2022 11:07:03 +0000
> John Keeping <john@metanate.com> wrote:
>
> > > > > +               print_time(cpu_tasks[i]->runtime, '_');
> > > > > +               printf(" (%%%lld)\n", (task->runtime * 100) / total_time);
> > > >
> > > > Is there a reason for using the CPU-specific runtime for the value and
> > > > the total runtime for the percentage?
> > > >
> > > > I expected the percentage to be the percentage of this CPU's time spent
> > > > running the task.
> > >
> > > We modify it so that each CPU has the same run time, unless there are
> > > missed events at the start (handled in later patches), in which case we
> > > change total_time to be the total time of the events on that CPU and
> > > not of the entire trace.
> >
> > I think we're talking about different things here; I probably wasn't
> > entirely clear about this.  It's the numerator of this division that I'm
> > concerned about, and I wonder if this should be:
> >
> >       (cpu_tasks[i]->runtime * 100) / total_time
>
> Ah, I did misunderstand you. Yes, that's a typo. Thanks for pointing
> that out. I'll go fix it.

> > > > > +       if (idle_task) {
> > > > > +               printf("idle:\t");
> > > > > +               print_time(idle_task->runtime, '_');
> > > > > +               printf(" (%%%lld)\n", (idle_task->runtime * 100) / total_time);

IIUC the idle_task->runtime works out the same as the per-CPU runtime,
but for consistency I think we should use idle_cpu_task->runtime here
as well.
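
Something like the following untested sketch, where idle_cpu_task is a
new local that tracks the struct task_cpu_item rather than the struct
task_item:

	struct task_cpu_item *idle_cpu_task = NULL;

	/* in the bucket walk: */
	if (cpu_task->task->pid <= 0)
		idle_cpu_task = cpu_task;
	else
		cpu_tasks[i++] = cpu_task;

	/* when printing: */
	if (idle_cpu_task) {
		printf("idle:\t");
		print_time(idle_cpu_task->runtime, '_');
		printf(" (%%%lld)\n",
		       (idle_cpu_task->runtime * 100) / total_time);
	}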

Thanks,
Kalesh


Patch

diff --git a/tracecmd/trace-analyze.c b/tracecmd/trace-analyze.c
index 2211079668e4..43e9ffa95687 100644
--- a/tracecmd/trace-analyze.c
+++ b/tracecmd/trace-analyze.c
@@ -44,6 +44,7 @@  struct task_item {
 };
 
 struct task_cpu_item {
+	unsigned long long	runtime;
 	struct trace_hash_item	hash;
 	struct task_item       	*task;
 };
@@ -171,6 +172,7 @@  static void update_cpu_task_times(struct cpu_data *cpu_data,
 
 	delta = ts - task->start_ts;
 	task->runtime += delta;
+	cpu_task->runtime += delta;
 }
 
 static void update_pid(struct cpu_data *cpu_data,
@@ -198,6 +200,7 @@  static void update_pid(struct cpu_data *cpu_data,
 	} else {
 		delta = record->ts - cpu_data->last_ts;
 		task->runtime += delta;
+		cpu_task->runtime += delta;
 	}
 
 	cpu_data->last_ts = record->ts;
@@ -246,6 +249,7 @@  static void update_first_pid(struct cpu_data *cpu_data)
 	task = cpu_task->task;
 	delta = cpu_data->start_ts - start_ts;
 	task->runtime += delta;
+	cpu_task->runtime += delta;
 	cpu_data->start_ts = start_ts;
 }
 
@@ -275,6 +279,17 @@  static int cmp_tasks(const void *A, const void *B)
 	return (*a)->runtime < (*b)->runtime;
 }
 
+static int cmp_cpu_tasks(const void *A, const void *B)
+{
+	struct task_cpu_item * const *a = A;
+	struct task_cpu_item * const *b = B;
+
+	if ((*a)->runtime > (*b)->runtime)
+		return -1;
+
+	return (*a)->runtime < (*b)->runtime;
+}
+
 static void print_time(unsigned long long ts, char delim)
 {
 	unsigned long long secs;
@@ -306,6 +321,67 @@  static void print_time(unsigned long long ts, char delim)
 	}
 }
 
+static void print_cpu_data(struct tep_handle *tep, struct cpu_data *cpu_data)
+{
+	unsigned long long total_time;
+	struct trace_hash_item **bucket;
+	struct trace_hash_item *item;
+	struct task_cpu_item **cpu_tasks;
+	struct task_cpu_item *cpu_task;
+	struct task_item *idle_task = NULL;
+	struct task_item *task;
+	struct analysis_data *data;
+	int nr_tasks;
+	int i = 0;
+
+	data = cpu_data->data;
+	total_time = data->last_ts - data->start_ts;
+
+	printf("\nCPU %d\n", cpu_data->cpu);
+	printf("-------\n");
+
+	cpu_tasks = malloc(sizeof(*cpu_tasks) * cpu_data->nr_tasks);
+
+	if (!cpu_tasks)
+		die("Could not allocate task array");
+
+	trace_hash_for_each_bucket(bucket, &cpu_data->tasks) {
+		trace_hash_for_each_item(item, bucket) {
+			cpu_task = task_cpu_from_hash(item);
+			if (cpu_task->task->pid <= 0)
+				idle_task = cpu_task->task;
+			else
+				cpu_tasks[i++] = cpu_task;
+		}
+	}
+	nr_tasks = i;
+
+	if (idle_task) {
+		printf("idle:\t");
+		print_time(idle_task->runtime, '_');
+		printf(" (%%%lld)\n", (idle_task->runtime * 100) / total_time);
+	} else {
+		printf("Never idle!\n");
+	}
+
+	qsort(cpu_tasks, nr_tasks, sizeof(*cpu_tasks), cmp_cpu_tasks);
+
+	for (i = 0; i < nr_tasks; i++) {
+		task = cpu_tasks[i]->task;
+
+		if (!i) {
+			printf("    Task name        PID \t     Run time\n");
+			printf("    ---------        --- \t     --------\n");
+		}
+		printf("%16s %8d\t",
+		       tep_data_comm_from_pid(tep, task->pid),
+		       task->pid);
+		print_time(cpu_tasks[i]->runtime, '_');
+		printf(" (%%%lld)\n", (task->runtime * 100) / total_time);
+	}
+	free(cpu_tasks);
+}
+
 static void print_total(struct tep_handle *tep, struct analysis_data *data)
 {
 	unsigned long long total_time;
@@ -376,6 +452,12 @@  static void print_total(struct tep_handle *tep, struct analysis_data *data)
 	free(tasks);
 
 	printf("\n");
+
+	for (i = 0; i < data->allocated_cpus; i++) {
+		if (!data->cpu_data[i].data)
+			continue;
+		print_cpu_data(tep, &data->cpu_data[i]);
+	}
 }
 
 static void free_tasks(struct trace_hash *hash)