mbox series

[V8,00/14] RTLA: An interface for osnoise/timerlat tracers

Message ID cover.1638182284.git.bristot@kernel.org
Headers show
Series RTLA: An interface for osnoise/timerlat tracers | expand

Message

Daniel Bristot de Oliveira Nov. 29, 2021, 11:07 a.m. UTC
The rtla(1) is a meta-tool that includes a set of commands that
aims to analyze the real-time properties of Linux. But instead of
testing Linux as a black box, rtla leverages kernel tracing
capabilities to provide precise information about the properties
and root causes of unexpected results.

To start, it presents an interface to the osnoise and timerlat tracers.
In the future, it will also serve as home to the rtsl [1] and other
latency/noise tracers.

If you just want to run it, you can download the tarball here:
  - https://bristot.me/files/rtla/tarball/rtla-0.4.tar.bz2

To compile rtla on fedora you need:
  $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git
  $ cd libtraceevent/
  $ make
  $ sudo make install
  $ cd ..
  $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git
  $ cd libtracefs/
  $ make
  $ sudo make install
  $ cd ..
  $ sudo dnf install python3-docutils procps-devel
  $ cd $rtla_src
  $ make
  $ sudo make install

The tracing option (-t) depends some kernel patches that are
available at [2].

This tool was be discussed at the RT-MC during LPC2021 [3]

[1] rtsl: https://github.com/bristot/rtsl/
[2] https://lore.kernel.org/lkml/cover.1635533292.git.bristot@kernel.org/
[3] https://youtu.be/cZUzc0U1jJ4

Changes from V7:
  - Add README.txt with information about how to compile the tool
    (Steven)
  - Use $$($(PKG_CONFIG) --libs libtracefs) to find libtracefs on
    Makefile (Steven)
  - Fix *buffer[4096] (using buffer[4096]) on save_trace_to_file()
    (Steven)
Changes from v6:
  - Revisited osnoise option config functions
  - Properly handles offline CPUs
  - Some cleanups
  - Fixed an histogram allocation problem (Tao Zhou)
  - Revisited open()/read()/write() (Tao Zhou)
Changes from v5:
  - Fix retval check in save_trace_to_file() (Tao Zhou)
  - Fix goto logic in save_trace_to_file() (Tao Zhou)
  - Removed unused save_trace_pipe_to_file() function
  - Correctly handle an error on osnoise_set_* functions during
    "apply config" for all tools (Tao Zhou)
Changes from v3:
  - Add cross-compile support (Ahmed S. Darwish)
  - Move documentation to Documentation/tools/rtla (Jonathan Corbet)
  - Use .rst format for documentation (Jonathan Corbet)
  - Use include option from .rst to group common parts of the documentation
  - Makefile (main and doc) cleanups
Changes from v2:
  - Fix the miss conception of the "size" for kernel histograms (Steven/Tom)
  - Change the --skip-zeros to --with-zeros option as the former is better
    for humans (and the latter for computers to plot charts).
  - A lot of checkpatch fixes for the user-space part.
Changes from v1:
  - Fixes -t options on osnoise tracers (-t means --trace for all tools now)
  - Fixes --bucket-size references (not --bucket_size)

Daniel Bristot de Oliveira (14):
  rtla: Real-Time Linux Analysis tool
  rtla: Helper functions for rtla
  rtla: Add osnoise tool
  rtla/osnoise: Add osnoise top mode
  rtla/osnoise: Add the hist mode
  rtla: Add timerlat tool and timelart top mode
  rtla/timerlat: Add timerlat hist mode
  rtla: Add Documentation
  rtla: Add rtla osnoise man page
  rtla: Add rtla osnoise top documentation
  rtla: Add rtla osnoise hist documentation
  rtla: Add rtla timerlat documentation
  rtla: Add rtla timerlat top documentation
  rtla: Add rtla timerlat hist documentation

 Documentation/tools/rtla/Makefile             |   41 +
 Documentation/tools/rtla/common_appendix.rst  |   12 +
 .../tools/rtla/common_hist_options.rst        |   23 +
 Documentation/tools/rtla/common_options.rst   |   24 +
 .../tools/rtla/common_osnoise_description.rst |    8 +
 .../tools/rtla/common_osnoise_options.rst     |   17 +
 .../rtla/common_timerlat_description.rst      |   10 +
 .../tools/rtla/common_timerlat_options.rst    |   16 +
 .../tools/rtla/common_top_options.rst         |    3 +
 .../tools/rtla/rtla-osnoise-hist.rst          |   66 ++
 Documentation/tools/rtla/rtla-osnoise-top.rst |   61 +
 Documentation/tools/rtla/rtla-osnoise.rst     |   59 +
 .../tools/rtla/rtla-timerlat-hist.rst         |  106 ++
 .../tools/rtla/rtla-timerlat-top.rst          |  145 +++
 Documentation/tools/rtla/rtla-timerlat.rst    |   57 +
 Documentation/tools/rtla/rtla.rst             |   48 +
 tools/tracing/rtla/Makefile                   |  102 ++
 tools/tracing/rtla/README.txt                 |   36 +
 tools/tracing/rtla/src/osnoise.c              | 1017 +++++++++++++++++
 tools/tracing/rtla/src/osnoise.h              |   96 ++
 tools/tracing/rtla/src/osnoise_hist.c         |  799 +++++++++++++
 tools/tracing/rtla/src/osnoise_top.c          |  577 ++++++++++
 tools/tracing/rtla/src/rtla.c                 |   87 ++
 tools/tracing/rtla/src/timerlat.c             |   72 ++
 tools/tracing/rtla/src/timerlat.h             |    4 +
 tools/tracing/rtla/src/timerlat_hist.c        |  820 +++++++++++++
 tools/tracing/rtla/src/timerlat_top.c         |  615 ++++++++++
 tools/tracing/rtla/src/trace.c                |  192 ++++
 tools/tracing/rtla/src/trace.h                |   27 +
 tools/tracing/rtla/src/utils.c                |  433 +++++++
 tools/tracing/rtla/src/utils.h                |   56 +
 31 files changed, 5629 insertions(+)
 create mode 100644 Documentation/tools/rtla/Makefile
 create mode 100644 Documentation/tools/rtla/common_appendix.rst
 create mode 100644 Documentation/tools/rtla/common_hist_options.rst
 create mode 100644 Documentation/tools/rtla/common_options.rst
 create mode 100644 Documentation/tools/rtla/common_osnoise_description.rst
 create mode 100644 Documentation/tools/rtla/common_osnoise_options.rst
 create mode 100644 Documentation/tools/rtla/common_timerlat_description.rst
 create mode 100644 Documentation/tools/rtla/common_timerlat_options.rst
 create mode 100644 Documentation/tools/rtla/common_top_options.rst
 create mode 100644 Documentation/tools/rtla/rtla-osnoise-hist.rst
 create mode 100644 Documentation/tools/rtla/rtla-osnoise-top.rst
 create mode 100644 Documentation/tools/rtla/rtla-osnoise.rst
 create mode 100644 Documentation/tools/rtla/rtla-timerlat-hist.rst
 create mode 100644 Documentation/tools/rtla/rtla-timerlat-top.rst
 create mode 100644 Documentation/tools/rtla/rtla-timerlat.rst
 create mode 100644 Documentation/tools/rtla/rtla.rst
 create mode 100644 tools/tracing/rtla/Makefile
 create mode 100644 tools/tracing/rtla/README.txt
 create mode 100644 tools/tracing/rtla/src/osnoise.c
 create mode 100644 tools/tracing/rtla/src/osnoise.h
 create mode 100644 tools/tracing/rtla/src/osnoise_hist.c
 create mode 100644 tools/tracing/rtla/src/osnoise_top.c
 create mode 100644 tools/tracing/rtla/src/rtla.c
 create mode 100644 tools/tracing/rtla/src/timerlat.c
 create mode 100644 tools/tracing/rtla/src/timerlat.h
 create mode 100644 tools/tracing/rtla/src/timerlat_hist.c
 create mode 100644 tools/tracing/rtla/src/timerlat_top.c
 create mode 100644 tools/tracing/rtla/src/trace.c
 create mode 100644 tools/tracing/rtla/src/trace.h
 create mode 100644 tools/tracing/rtla/src/utils.c
 create mode 100644 tools/tracing/rtla/src/utils.h

Comments

Tao Zhou Nov. 30, 2021, 3:35 p.m. UTC | #1
Hi Daniel,

On Mon, Nov 29, 2021 at 12:07:41PM +0100, Daniel Bristot de Oliveira wrote:
> +/*
> + * osnoise_restore_cpus - restore the original "osnoise/cpus"
> + *
> + * osnoise_set_cpus() saves the original data for the "osnoise/cpus"

osnoise_set_cpus() --> osnoise_restore_cpus()

> + * file. This function restore the original config it was previously
> + * modified.
> + */
> +void osnoise_restore_cpus(struct osnoise_context *context)
> +{
> +	int retval;
> +
> +	if (!context->orig_cpus)
> +		return;
> +
> +	if (!context->curr_cpus)
> +		return;
> +
> +	/* nothing to do? */
> +	if (!strcmp(context->orig_cpus, context->curr_cpus))
> +		goto out_done;
> +
> +	retval = write(context->cpus_fd, context->orig_cpus, strlen(context->orig_cpus));

'strlen(context->orig_cpus) + 1' for write size;

> +	if (retval < strlen(context->orig_cpus))

Same here. Check 'strlen(context->orig_cpus) + 1'

> +		err_msg("could not restore original osnoise cpus\n");
> +
> +out_done:
> +	free(context->curr_cpus);
> +	context->curr_cpus = NULL;
> +}
> +
> +/*
> + * osnoise_get_runtime - return the original "osnoise/runtime_us" value
> + *
> + * It also saves the value to be restored.
> + */
> +unsigned long long osnoise_get_runtime(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	long long runtime_us;
> +	char *runtime_path;
> +	int retval;
> +
> +	if (context->runtime_us != OSNOISE_TIME_INIT_VAL)
> +		return context->runtime_us;
> +
> +	if (context->orig_runtime_us != OSNOISE_TIME_INIT_VAL)
> +		return context->orig_runtime_us;
> +
> +	runtime_path = tracefs_get_tracing_file("osnoise/runtime_us");
> +
> +	context->runtime_fd = open(runtime_path, O_RDWR);
> +	if (context->runtime_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->runtime_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	runtime_us = get_llong_from_str(buffer);
> +	if (runtime_us < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(runtime_path);
> +
> +	context->orig_runtime_us = runtime_us;
> +	return runtime_us;
> +
> +out_close:
> +	close(context->runtime_fd);
> +	context->runtime_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(runtime_path);
> +	return 0;
> +}
> +/*
> + * osnoise_get_period - return the original "osnoise/period_us" value
> + *
> + * It also saves the value to be restored.
> + */
> +unsigned long long osnoise_get_period(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	char *period_path;
> +	long long period_us;
> +	int retval;
> +
> +	if (context->period_us != OSNOISE_TIME_INIT_VAL)
> +		return context->period_us;
> +
> +	if (context->orig_period_us != OSNOISE_TIME_INIT_VAL)
> +		return context->orig_period_us;
> +
> +	period_path = tracefs_get_tracing_file("osnoise/period_us");
> +
> +	context->period_fd = open(period_path, O_RDWR);
> +	if (context->period_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->period_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	period_us = get_llong_from_str(buffer);
> +	if (period_us < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(period_path);
> +
> +	context->orig_period_us = period_us;
> +	return period_us;
> +
> +out_close:
> +	close(context->period_fd);
> +	context->period_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(period_path);
> +	return 0;
> +}

osnoise_get_period() and osnoise_get_runtime() almost the same.
Use macro to generate code. Some thing also not sure now. Shame


#define osnoise_get_period osnoise_get(period)
#define osnoise_get_runtime osnoise_get(runtime)

#define osnoise_get(x)	\
unsigned long long osnoise_get_##x(struct osnoise_context *context) \ 
{              \
	char buffer[BUFF_U64_STR_SIZE];             \
	char * x##_path;             \ 
	long long x##_us;            \
	if (context->x##_us != OSNOISE_TIME_INIT_VAL)                   \
		return context->x##_us;           \
	if (context->orig_##x##_us != OSNOISE_TIME_INIT_VAL)            \
		return context->orig_##x##_us;          \
	x##_path = tracefs_get_tracing_file("osnoise/x##_us");        \
	context->x##_fd = open(x##_path, O_RDWR);               \
	if (context->x##_fd < 0)                        \
		goto out_err;                 \
	retval = read(context->x##_fd, &buffer, sizeof(buffer));        \
	if (retval <= 0)                  \
		goto out_close;               \
	x##_us = get_llong_from_str(buffer);            \
	if (x##_us < 0)                   \
		goto out_close;               \
	tracefs_put_tracing_file(x##_path);             \
	context->orig_##x##_us = x##_us;                \
	return x##_us;                    \
out_close:                            \
	close(context->x##_fd);           \
	context->x##_fd = CLOSED_FD;                    \
out_err:                              \
	tracefs_put_tracing_file(x##_path);             \
	return 0;                         \
}


> +
> +static int __osnoise_write_runtime(struct osnoise_context *context,
> +				   unsigned long long runtime)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	int retval;
> +
> +	if (context->orig_runtime_us == OSNOISE_TIME_INIT_VAL)
> +		return -1;
> +
> +	snprintf(buffer, sizeof(buffer), "%llu\n", runtime);
> +
> +	retval = write(context->runtime_fd, buffer, strlen(buffer) + 1);
> +	if (retval < (strlen(buffer) + 1))
> +		return -1;
> +
> +	context->runtime_us = runtime;
> +	return 0;
> +}
> +
> +static int __osnoise_write_period(struct osnoise_context *context,
> +				  unsigned long long period)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	int retval;
> +
> +	if (context->orig_period_us == OSNOISE_TIME_INIT_VAL)
> +		return -1;
> +
> +	snprintf(buffer, sizeof(buffer), "%llu\n", period);
> +
> +	retval = write(context->period_fd, buffer, strlen(buffer) + 1);
> +	if (retval < (strlen(buffer) + 1))
> +		return -1;
> +
> +	context->period_us = period;
> +	return 0;
> +}

__osnoise_write_period() and __osnoise_write_runtime() share
almost the same code. Macro also use in here. Not sure it is
right. Shame.

#define __osnoise_write(x) \
static int __osnoise_write_##x(struct osnoise_context *context,    \
				  unsigned long long period)    \
{     \
	char buffer[BUFF_U64_STR_SIZE];     \
	int retval;     \
	if (context->orig_##x##_us == OSNOISE_TIME_INIT_VAL)   \
		return -1;    \
	snprintf(buffer, sizeof(buffer), "%llu\n", period);    \
	retval = write(context->x##_fd, buffer, strlen(buffer) + 1);   \
	if (retval < (strlen(buffer) + 1))    \
		return -1;    \
	context->x##_us = period;    \
	return 0;    \
}

#define __osnoise_write_period __osnoise_write(period)
#define __osnoise_write_runtime __osnoise_write(runtime)

> +
> +/*
> + * osnoise_set_runtime_period - set osnoise runtime and period
> + *
> + * Osnoise's runtime and period are related as runtime <= period.
> + * Thus, this function saves the original values, and then tries
> + * to set the runtime and period if they are != 0.
> + */
> +int osnoise_set_runtime_period(struct osnoise_context *context,
> +			       unsigned long long runtime,
> +			       unsigned long long period)
> +{
> +	unsigned long long curr_runtime_us;
> +	unsigned long long curr_period_us;
> +	int retval;
> +
> +	if (!period && !runtime)
> +		return 0;
> +
> +	curr_runtime_us = osnoise_get_runtime(context);
> +	curr_period_us = osnoise_get_period(context);
> +
> +	/* error getting any value? */
> +	if (curr_period_us == -1 || curr_runtime_us == -1)
> +		return -1;

'curr_period_us' and 'curr_runtime_us' error value should be
0(OSNOISE_TIME_INIT_VAL).

> +
> +	if (!period) {
> +		if (runtime > curr_period_us)
> +			return -1;
> +		return __osnoise_write_runtime(context, runtime);
> +	} else if (!runtime) {
> +		if (period < curr_runtime_us)
> +			return -1;
> +		return __osnoise_write_period(context, period);
> +	}
> +
> +	if (runtime > curr_period_us) {
> +		retval = __osnoise_write_period(context, period);
> +		if (retval)
> +			return -1;
> +		retval = __osnoise_write_runtime(context, runtime);
> +		if (retval)
> +			return -1;
> +	} else {
> +		retval = __osnoise_write_runtime(context, runtime);
> +		if (retval)
> +			return -1;
> +		retval = __osnoise_write_period(context, period);
> +		if (retval)
> +			return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * osnoise_get_timerlat_period_us - read and save the original "timerlat_period_us"
> + */
> +static long long
> +osnoise_get_timerlat_period_us(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	long long timerlat_period_us;
> +	char *stop_path;
> +	int retval;
> +
> +	if (context->timerlat_period_us != OSNOISE_TIME_INIT_VAL)
> +		return context->timerlat_period_us;
> +
> +	if (context->orig_timerlat_period_us != OSNOISE_TIME_INIT_VAL)
> +		return context->orig_timerlat_period_us;
> +
> +	stop_path = tracefs_get_tracing_file("osnoise/timerlat_period_us");

Using timerlat_period_path seems to be straightforward.

> +
> +	context->timerlat_period_us_fd = open(stop_path, O_RDWR);
> +	if (context->timerlat_period_us_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->timerlat_period_us_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	timerlat_period_us = get_llong_from_str(buffer);
> +	if (timerlat_period_us < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(stop_path);
> +
> +	context->orig_timerlat_period_us = timerlat_period_us;
> +	return timerlat_period_us;
> +
> +out_close:
> +	close(context->timerlat_period_us_fd);
> +	context->timerlat_period_us_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(stop_path);
> +	return -1;

Should return 0 or OSNOISE_TIME_INIT_VAL and the return type
can be unsigned long long.

> +}
> +
> +/*
> + * osnoise_get_stop_us - read and save the original "stop_tracing_us"
> + */
> +static long long
> +osnoise_get_stop_us(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	long long stop_us;
> +	char *stop_path;
> +	int retval;
> +
> +	if (context->stop_us != OSNOISE_OPTION_INIT_VAL)
> +		return context->stop_us;
> +
> +	if (context->orig_stop_us != OSNOISE_OPTION_INIT_VAL)
> +		return context->orig_stop_us;
> +
> +	stop_path = tracefs_get_tracing_file("osnoise/stop_tracing_us");
> +
> +	context->stop_us_fd = open(stop_path, O_RDWR);
> +	if (context->stop_us_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->stop_us_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	stop_us = get_llong_from_str(buffer);
> +	if (stop_us < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(stop_path);
> +
> +	context->orig_stop_us = stop_us;
> +	return stop_us;
> +
> +out_close:
> +	close(context->stop_us_fd);
> +	context->stop_us_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(stop_path);
> +	return -1;

The same.

> +}
> +
> +/*
> + * osnoise_get_stop_total_us - read and save the original "stop_tracing_total_us"
> + */
> +static long long
> +osnoise_get_stop_total_us(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	long long stop_total_us;
> +	char *stop_path;

Use stop_total_path to differentiate with stop_path used in
osnoise_get_stop_us().

> +	int retval;
> +
> +	if (context->stop_total_us != OSNOISE_OPTION_INIT_VAL)
> +		return context->stop_total_us;
> +
> +	if (context->orig_stop_total_us != OSNOISE_OPTION_INIT_VAL)
> +		return context->orig_stop_total_us;
> +
> +	stop_path = tracefs_get_tracing_file("osnoise/stop_tracing_total_us");
> +
> +	context->stop_total_us_fd = open(stop_path, O_RDWR);
> +	if (context->stop_total_us_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->stop_total_us_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	stop_total_us = get_llong_from_str(buffer);
> +	if (stop_total_us < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(stop_path);
> +
> +	context->orig_stop_total_us = stop_total_us;
> +	return stop_total_us;
> +
> +out_close:
> +	close(context->stop_total_us_fd);
> +	context->stop_total_us_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(stop_path);
> +	return -1;

The same.

> +}
> +
> +/*
> + * osnoise_get_print_stack - read and save the original "print_stack"
> + */
> +static long long
> +osnoise_get_print_stack(struct osnoise_context *context)
> +{
> +	char buffer[BUFF_U64_STR_SIZE];
> +	long long print_stack;
> +	char *stop_path;

This is print_stack_path.

> +	int retval;
> +
> +	if (context->print_stack != OSNOISE_OPTION_INIT_VAL)
> +		return context->print_stack;
> +
> +	if (context->orig_print_stack != OSNOISE_OPTION_INIT_VAL)
> +		return context->orig_print_stack;
> +
> +	stop_path = tracefs_get_tracing_file("osnoise/print_stack");
> +
> +	context->print_stack_fd = open(stop_path, O_RDWR);
> +	if (context->print_stack_fd < 0)
> +		goto out_err;
> +
> +	retval = read(context->print_stack_fd, &buffer, sizeof(buffer));
> +	if (retval <= 0)
> +		goto out_close;
> +
> +	print_stack = get_llong_from_str(buffer);
> +	if (print_stack < 0)
> +		goto out_close;
> +
> +	tracefs_put_tracing_file(stop_path);
> +
> +	context->orig_print_stack = print_stack;
> +	return print_stack;
> +
> +out_close:
> +	close(context->print_stack_fd);
> +	context->print_stack_fd = CLOSED_FD;
> +out_err:
> +	tracefs_put_tracing_file(stop_path);
> +	return -1;

The same.

> +}
> +/*
> + * osnoise_context_alloc - alloc an osnoise_context
> + *
> + * The osnoise context contains the information of the "osnoise/" configs.
> + * It is used to set and restore the config.
> + */
> +struct osnoise_context *osnoise_context_alloc(void)
> +{
> +	struct osnoise_context *context;
> +
> +	context = calloc(1, sizeof(*context));
> +	if (!context)
> +		goto out_err;
> +
> +	context->cpus_fd 		= CLOSED_FD;
> +	context->runtime_fd		= CLOSED_FD;
> +	context->period_fd		= CLOSED_FD;
> +	context->stop_us_fd		= CLOSED_FD;
> +	context->stop_total_us_fd	= CLOSED_FD;
> +	context->timerlat_period_us_fd	= CLOSED_FD;
> +	context->print_stack_fd		= CLOSED_FD;
> +
> +	context->orig_stop_us		= OSNOISE_OPTION_INIT_VAL;
> +	context->stop_us		= OSNOISE_OPTION_INIT_VAL;
> +
> +	context->orig_stop_total_us	= OSNOISE_OPTION_INIT_VAL;
> +	context->stop_total_us		= OSNOISE_OPTION_INIT_VAL;
> +
> +	context->orig_print_stack	= OSNOISE_OPTION_INIT_VAL;
> +	context->print_stack		= OSNOISE_OPTION_INIT_VAL;
> +
> +	osnoise_get_context(context);
> +
> +	return context;
> +out_err:
> +	if (context)
> +		free(context);

context is NULL here, so no need the check. Just directly return NULL
when 'if(!context)' is enough.

> +	return NULL;
> +}

Sorry for my slow and not complete reply.. and leave not sure here.

Thanks,
Tao
Daniel Bristot de Oliveira Dec. 2, 2021, 3:18 p.m. UTC | #2
Hi Tao

On 11/30/21 16:35, Tao Zhou wrote:
> Hi Daniel,
> 
> On Mon, Nov 29, 2021 at 12:07:41PM +0100, Daniel Bristot de Oliveira wrote:
>> +/*
>> + * osnoise_restore_cpus - restore the original "osnoise/cpus"
>> + *
>> + * osnoise_set_cpus() saves the original data for the "osnoise/cpus"
> 
> osnoise_set_cpus() --> osnoise_restore_cpus()

This comment is correct... osnoise_set_cpus saves the data (in the tool),
osnoise_restore_cpus() restores it.

>> + * file. This function restore the original config it was previously
>> + * modified.
>> + */
>> +void osnoise_restore_cpus(struct osnoise_context *context)
>> +{
>> +	int retval;
>> +
>> +	if (!context->orig_cpus)
>> +		return;
>> +
>> +	if (!context->curr_cpus)
>> +		return;
>> +
>> +	/* nothing to do? */
>> +	if (!strcmp(context->orig_cpus, context->curr_cpus))
>> +		goto out_done;
>> +
>> +	retval = write(context->cpus_fd, context->orig_cpus, strlen(context->orig_cpus));
> 
> 'strlen(context->orig_cpus) + 1' for write size;
> 
>> +	if (retval < strlen(context->orig_cpus))
> 
> Same here. Check 'strlen(context->orig_cpus) + 1'

Fixed in v9.

>> +		err_msg("could not restore original osnoise cpus\n");
>> +
>> +out_done:
>> +	free(context->curr_cpus);
>> +	context->curr_cpus = NULL;
>> +}
>> +
>> +/*
>> + * osnoise_get_runtime - return the original "osnoise/runtime_us" value
>> + *
>> + * It also saves the value to be restored.
>> + */
>> +unsigned long long osnoise_get_runtime(struct osnoise_context *context)
>> +{
>> +	char buffer[BUFF_U64_STR_SIZE];
>> +	long long runtime_us;
>> +	char *runtime_path;
>> +	int retval;
>> +
>> +	if (context->runtime_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->runtime_us;
>> +
>> +	if (context->orig_runtime_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->orig_runtime_us;
>> +
>> +	runtime_path = tracefs_get_tracing_file("osnoise/runtime_us");
>> +
>> +	context->runtime_fd = open(runtime_path, O_RDWR);
>> +	if (context->runtime_fd < 0)
>> +		goto out_err;
>> +
>> +	retval = read(context->runtime_fd, &buffer, sizeof(buffer));
>> +	if (retval <= 0)
>> +		goto out_close;
>> +
>> +	runtime_us = get_llong_from_str(buffer);
>> +	if (runtime_us < 0)
>> +		goto out_close;
>> +
>> +	tracefs_put_tracing_file(runtime_path);
>> +
>> +	context->orig_runtime_us = runtime_us;
>> +	return runtime_us;
>> +
>> +out_close:
>> +	close(context->runtime_fd);
>> +	context->runtime_fd = CLOSED_FD;
>> +out_err:
>> +	tracefs_put_tracing_file(runtime_path);
>> +	return 0;
>> +}
>> +/*
>> + * osnoise_get_period - return the original "osnoise/period_us" value
>> + *
>> + * It also saves the value to be restored.
>> + */
>> +unsigned long long osnoise_get_period(struct osnoise_context *context)
>> +{
>> +	char buffer[BUFF_U64_STR_SIZE];
>> +	char *period_path;
>> +	long long period_us;
>> +	int retval;
>> +
>> +	if (context->period_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->period_us;
>> +
>> +	if (context->orig_period_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->orig_period_us;
>> +
>> +	period_path = tracefs_get_tracing_file("osnoise/period_us");
>> +
>> +	context->period_fd = open(period_path, O_RDWR);
>> +	if (context->period_fd < 0)
>> +		goto out_err;
>> +
>> +	retval = read(context->period_fd, &buffer, sizeof(buffer));
>> +	if (retval <= 0)
>> +		goto out_close;
>> +
>> +	period_us = get_llong_from_str(buffer);
>> +	if (period_us < 0)
>> +		goto out_close;
>> +
>> +	tracefs_put_tracing_file(period_path);
>> +
>> +	context->orig_period_us = period_us;
>> +	return period_us;
>> +
>> +out_close:
>> +	close(context->period_fd);
>> +	context->period_fd = CLOSED_FD;
>> +out_err:
>> +	tracefs_put_tracing_file(period_path);
>> +	return 0;
>> +}
> 
> osnoise_get_period() and osnoise_get_runtime() almost the same.
> Use macro to generate code. Some thing also not sure now. Shame
> 
> 
> #define osnoise_get_period osnoise_get(period)
> #define osnoise_get_runtime osnoise_get(runtime)
> 
> #define osnoise_get(x)	\
> unsigned long long osnoise_get_##x(struct osnoise_context *context) \ 
> {              \
> 	char buffer[BUFF_U64_STR_SIZE];             \
> 	char * x##_path;             \ 
> 	long long x##_us;            \
> 	if (context->x##_us != OSNOISE_TIME_INIT_VAL)                   \
> 		return context->x##_us;           \
> 	if (context->orig_##x##_us != OSNOISE_TIME_INIT_VAL)            \
> 		return context->orig_##x##_us;          \
> 	x##_path = tracefs_get_tracing_file("osnoise/x##_us");        \
> 	context->x##_fd = open(x##_path, O_RDWR);               \
> 	if (context->x##_fd < 0)                        \
> 		goto out_err;                 \
> 	retval = read(context->x##_fd, &buffer, sizeof(buffer));        \
> 	if (retval <= 0)                  \
> 		goto out_close;               \
> 	x##_us = get_llong_from_str(buffer);            \
> 	if (x##_us < 0)                   \
> 		goto out_close;               \
> 	tracefs_put_tracing_file(x##_path);             \
> 	context->orig_##x##_us = x##_us;                \
> 	return x##_us;                    \
> out_close:                            \
> 	close(context->x##_fd);           \
> 	context->x##_fd = CLOSED_FD;                    \
> out_err:                              \
> 	tracefs_put_tracing_file(x##_path);             \
> 	return 0;                         \
> }


I am not sure if it is worth to trade the readability for just two functions. I
will keep this as is foe now, and think about it in a second moment.

[...]
>> +
>> +/*
>> + * osnoise_set_runtime_period - set osnoise runtime and period
>> + *
>> + * Osnoise's runtime and period are related as runtime <= period.
>> + * Thus, this function saves the original values, and then tries
>> + * to set the runtime and period if they are != 0.
>> + */
>> +int osnoise_set_runtime_period(struct osnoise_context *context,
>> +			       unsigned long long runtime,
>> +			       unsigned long long period)
>> +{
>> +	unsigned long long curr_runtime_us;
>> +	unsigned long long curr_period_us;
>> +	int retval;
>> +
>> +	if (!period && !runtime)
>> +		return 0;
>> +
>> +	curr_runtime_us = osnoise_get_runtime(context);
>> +	curr_period_us = osnoise_get_period(context);
>> +
>> +	/* error getting any value? */
>> +	if (curr_period_us == -1 || curr_runtime_us == -1)
>> +		return -1;
> 
> 'curr_period_us' and 'curr_runtime_us' error value should be
> 0(OSNOISE_TIME_INIT_VAL).
> 


Right, I am now (in v9) returning the *_INIT_VAL on all errors, and using the
macro to check for errors.

[...]

>> +static long long
>> +osnoise_get_timerlat_period_us(struct osnoise_context *context)
>> +{
>> +	char buffer[BUFF_U64_STR_SIZE];
>> +	long long timerlat_period_us;
>> +	char *stop_path;
>> +	int retval;
>> +
>> +	if (context->timerlat_period_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->timerlat_period_us;
>> +
>> +	if (context->orig_timerlat_period_us != OSNOISE_TIME_INIT_VAL)
>> +		return context->orig_timerlat_period_us;
>> +
>> +	stop_path = tracefs_get_tracing_file("osnoise/timerlat_period_us");
> 
> Using timerlat_period_path seems to be straightforward.
> 

I am using config_path for all variables like this.


[...]

>> +/*
>> + * osnoise_context_alloc - alloc an osnoise_context
>> + *
>> + * The osnoise context contains the information of the "osnoise/" configs.
>> + * It is used to set and restore the config.
>> + */
>> +struct osnoise_context *osnoise_context_alloc(void)
>> +{
>> +	struct osnoise_context *context;
>> +
>> +	context = calloc(1, sizeof(*context));
>> +	if (!context)
>> +		goto out_err;
>> +
>> +	context->cpus_fd 		= CLOSED_FD;
>> +	context->runtime_fd		= CLOSED_FD;
>> +	context->period_fd		= CLOSED_FD;
>> +	context->stop_us_fd		= CLOSED_FD;
>> +	context->stop_total_us_fd	= CLOSED_FD;
>> +	context->timerlat_period_us_fd	= CLOSED_FD;
>> +	context->print_stack_fd		= CLOSED_FD;
>> +
>> +	context->orig_stop_us		= OSNOISE_OPTION_INIT_VAL;
>> +	context->stop_us		= OSNOISE_OPTION_INIT_VAL;
>> +
>> +	context->orig_stop_total_us	= OSNOISE_OPTION_INIT_VAL;
>> +	context->stop_total_us		= OSNOISE_OPTION_INIT_VAL;
>> +
>> +	context->orig_print_stack	= OSNOISE_OPTION_INIT_VAL;
>> +	context->print_stack		= OSNOISE_OPTION_INIT_VAL;
>> +
>> +	osnoise_get_context(context);
>> +
>> +	return context;
>> +out_err:
>> +	if (context)
>> +		free(context);
> 
> context is NULL here, so no need the check. Just directly return NULL
> when 'if(!context)' is enough.
> 
>> +	return NULL;
>> +}

In v9, I am removing the goto, returning NULL if (!context).

> Sorry for my slow and not complete reply.. and leave not sure here.
> 
> Thanks,
> Tao

Thanks Tao
-- Daniel
Tao Zhou Dec. 4, 2021, 1:25 p.m. UTC | #3
On Mon, Nov 29, 2021 at 12:07:38PM +0100, Daniel Bristot de Oliveira wrote:

> The rtla(1) is a meta-tool that includes a set of commands that
> aims to analyze the real-time properties of Linux. But instead of
> testing Linux as a black box, rtla leverages kernel tracing
> capabilities to provide precise information about the properties
> and root causes of unexpected results.
> 
> To start, it presents an interface to the osnoise and timerlat tracers.
> In the future, it will also serve as home to the rtsl [1] and other
> latency/noise tracers.
> 
> If you just want to run it, you can download the tarball here:
>   - https://bristot.me/files/rtla/tarball/rtla-0.4.tar.bz2
> 
> To compile rtla on fedora you need:
>   $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git
>   $ cd libtraceevent/
>   $ make
>   $ sudo make install
>   $ cd ..
>   $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git
>   $ cd libtracefs/
>   $ make
>   $ sudo make install
>   $ cd ..
>   $ sudo dnf install python3-docutils procps-devel
>   $ cd $rtla_src
>   $ make
>   $ sudo make install

Set osnoise/x to DEADLINE, the return is not success. see:

tao@geo ~/opensource/rtla-0.4 $ sudo rtla osnoise top -P d:100us:1ms -c 0-3 -r 900000 -d 1M -q
boost_with_deadline failed to boost pid 4766: Operation not permitted
Failed to set sched parameters

top shows:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND         
 4773 root      20   0       0      0      0 R  89.4   0.0   2:06.88 osnoise/0       
 4776 root      20   0       0      0      0 R  89.4   0.0   2:06.88 osnoise/3       
 4774 root      20   0       0      0      0 R  89.0   0.0   2:06.88 osnoise/1       
 4775 root      20   0       0      0      0 R  88.7   0.0   2:06.69 osnoise/2       

 t
Daniel Bristot de Oliveira Dec. 4, 2021, 2:16 p.m. UTC | #4
On 12/4/21 14:25, Tao Zhou wrote:
> On Mon, Nov 29, 2021 at 12:07:38PM +0100, Daniel Bristot de Oliveira wrote:
> 
>> The rtla(1) is a meta-tool that includes a set of commands that
>> aims to analyze the real-time properties of Linux. But instead of
>> testing Linux as a black box, rtla leverages kernel tracing
>> capabilities to provide precise information about the properties
>> and root causes of unexpected results.
>>
>> To start, it presents an interface to the osnoise and timerlat tracers.
>> In the future, it will also serve as home to the rtsl [1] and other
>> latency/noise tracers.
>>
>> If you just want to run it, you can download the tarball here:
>>   - https://bristot.me/files/rtla/tarball/rtla-0.4.tar.bz2
>>
>> To compile rtla on fedora you need:
>>   $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git
>>   $ cd libtraceevent/
>>   $ make
>>   $ sudo make install
>>   $ cd ..
>>   $ git clone git://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git
>>   $ cd libtracefs/
>>   $ make
>>   $ sudo make install
>>   $ cd ..
>>   $ sudo dnf install python3-docutils procps-devel
>>   $ cd $rtla_src
>>   $ make
>>   $ sudo make install
> 
> Set osnoise/x to DEADLINE, the return is not success. see:
> 
> tao@geo ~/opensource/rtla-0.4 $ sudo rtla osnoise top -P d:100us:1ms -c 0-3 -r 900000 -d 1M -q
> boost_with_deadline failed to boost pid 4766: Operation not permitted
> Failed to set sched parameters

Did you disable the admission control?

e.g.,

  $ sudo sysctl -w kernel.sched_rt_runtime_us=-1

-- Daniel
Steven Rostedt Dec. 8, 2021, 10:14 p.m. UTC | #5
On Thu, 2 Dec 2021 16:18:53 +0100
Daniel Bristot de Oliveira <bristot@kernel.org> wrote:

> >> +	if (!strcmp(context->orig_cpus, context->curr_cpus))
> >> +		goto out_done;
> >> +
> >> +	retval = write(context->cpus_fd, context->orig_cpus, strlen(context->orig_cpus));  
> > 
> > 'strlen(context->orig_cpus) + 1' for write size;
> >   
> >> +	if (retval < strlen(context->orig_cpus))  
> > 
> > Same here. Check 'strlen(context->orig_cpus) + 1'  
> 
> Fixed in v9.

And if you used the tracefs_instance_file_write() function, you would not
have had his bug ;-)

-- Steve