mbox series

[0/6] selftests/resctrl: Support diverse platforms with MBM and MBA tests

Message ID cover.1724970211.git.reinette.chatre@intel.com
Headers show
Series selftests/resctrl: Support diverse platforms with MBM and MBA tests | expand

Message

Reinette Chatre Aug. 29, 2024, 10:52 p.m. UTC
The resctrl selftests for Memory Bandwidth Allocation (MBA) and Memory
Bandwidth Monitoring (MBM) are failing on some (for example [1]) Emerald
Rapids systems. The test failures result from the following two
properties of these systems:
1) Emerald Rapids systems can have up to 320MB L3 cache. The resctrl
   MBA and MBM selftests measure memory traffic for which a hardcoded
   250MB buffer has been sufficient so far. On platforms with L3 cache
   larger than the buffer, the buffer fits in the L3 cache and thus
   no/very little memory traffic is generated during the "memory
   bandwidth" tests.
2) Some platform features, for example RAS features or memory
   performance features that generate memory traffic may drive accesses
   that are counted differently by performance counters and MBM
   respectively, for instance generating "overhead" traffic which is not
   counted against any specific RMID. Until now these counting
   differences have always been "in the noise". On Emerald Rapids
   systems the maximum MBA throttling (10% memory bandwidth)
   throttles memory bandwidth to where memory accesses by these other
   platform features push the memory bandwidth difference between
   memory controller performance counters and resctrl (MBM) beyond the
   tests' hardcoded tolerance.

Make the tests more robust against platform variations:
1) Let the buffer used by memory bandwidth tests be guided by the size
   of the L3 cache.
2) Larger buffers require longer initialization time before the buffer can
   be used to measurement. Rework the tests to ensure that buffer
   initialization is complete before measurements start.
3) Do not compare performance counters and MBM measurements at low
   bandwidth. The value of "low" is hardcoded to 750MiB based on
   measurements on Emerald Rapids, Sapphire Rapids, and Ice Lake
   systems. This limit is not applicable to AMD systems since it
   only applies to the MBA and MBM tests that are isolated to Intel.

[1]
https://ark.intel.com/content/www/us/en/ark/products/237261/intel-xeon-platinum-8592-processor-320m-cache-1-9-ghz.html

Reinette Chatre (6):
  selftests/resctrl: Fix sparse warnings
  selftests/resctrl: Ensure measurements skip initialization of default
    benchmark
  selftests/resctrl: Simplify benchmark parameter passing
  selftests/resctrl: Use cache size to determine "fill_buf" buffer size
  selftests/resctrl: Do not compare performance counters and resctrl at
    low bandwidth
  selftests/resctrl: Keep results from first test run

 tools/testing/selftests/resctrl/cmt_test.c    |  33 +--
 tools/testing/selftests/resctrl/fill_buf.c    |  19 +-
 tools/testing/selftests/resctrl/mba_test.c    |  26 +-
 tools/testing/selftests/resctrl/mbm_test.c    |  25 +-
 tools/testing/selftests/resctrl/resctrl.h     |  57 +++--
 .../testing/selftests/resctrl/resctrl_tests.c |  15 +-
 tools/testing/selftests/resctrl/resctrl_val.c | 223 +++++-------------
 7 files changed, 152 insertions(+), 246 deletions(-)

Comments

Ilpo Järvinen Aug. 30, 2024, 10:29 a.m. UTC | #1
On Thu, 29 Aug 2024, Reinette Chatre wrote:

> Fix following sparse warnings:
>  tools/testing/selftests/resctrl/resctrl_val.c:47:6: warning: symbol 'membw_initialize_perf_event_attr' was not declared. Should it be static?
>  tools/testing/selftests/resctrl/resctrl_val.c:64:6: warning: symbol 'membw_ioctl_perf_event_ioc_reset_enable' was not declared. Should it be
> static?
>  tools/testing/selftests/resctrl/resctrl_val.c:70:6: warning: symbol 'membw_ioctl_perf_event_ioc_disable' was not declared. Should it be static?
>  tools/testing/selftests/resctrl/resctrl_val.c:81:6: warning: symbol 'get_event_and_umask' was not declared. Should it be static?
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

--
 i.
Ilpo Järvinen Aug. 30, 2024, 11:13 a.m. UTC | #2
On Thu, 29 Aug 2024, Reinette Chatre wrote:

> The benchmark used during the CMT, MBM, and MBA tests can be provided by
> the user via (-b) parameter to the tests, if not provided the default
> "fill_buf" benchmark is used.
> 
> The "fill_buf" benchmark requires parameters and these are managed as
> an array of strings.
> 
> Using an array of strings to manage the "fill_buf" parameters is
> complex because it requires transformations to/from strings at every
> producer and consumer. This is made worse for the individual tests
> where the default benchmark parameters values may not be appropriate and
> additional data wrangling is required. For example, the CMT test
> duplicates the entire array of strings in order to replace one of
> the parameters.
> 
> Replace the "array of strings" parameters used for "fill_buf" with a
> struct that contains the "fill_buf" parameters that can be used directly
> without transformations to/from strings. Make these parameters
> part of the parameters associated with each test so that each test can
> set benchmark parameters that are appropriate for it.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  tools/testing/selftests/resctrl/cmt_test.c    | 28 +++--------
>  tools/testing/selftests/resctrl/mba_test.c    |  7 ++-
>  tools/testing/selftests/resctrl/mbm_test.c    |  9 +++-
>  tools/testing/selftests/resctrl/resctrl.h     | 49 +++++++++++++------
>  .../testing/selftests/resctrl/resctrl_tests.c | 15 +-----
>  tools/testing/selftests/resctrl/resctrl_val.c | 38 +++++---------
>  6 files changed, 69 insertions(+), 77 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
> index 0c045080d808..f09d5dfab38c 100644
> --- a/tools/testing/selftests/resctrl/cmt_test.c
> +++ b/tools/testing/selftests/resctrl/cmt_test.c
> @@ -116,15 +116,12 @@ static void cmt_test_cleanup(void)
>  
>  static int cmt_run_test(const struct resctrl_test *test, const struct user_params *uparams)
>  {
> -	const char * const *cmd = uparams->benchmark_cmd;
> -	const char *new_cmd[BENCHMARK_ARGS];
>  	unsigned long cache_total_size = 0;
>  	int n = uparams->bits ? : 5;
>  	unsigned long long_mask;
> -	char *span_str = NULL;
>  	int count_of_bits;
>  	size_t span;
> -	int ret, i;
> +	int ret;
>  
>  	ret = get_full_cbm("L3", &long_mask);
>  	if (ret)
> @@ -155,32 +152,21 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
>  
>  	span = cache_portion_size(cache_total_size, param.mask, long_mask);
>  
> -	if (strcmp(cmd[0], "fill_buf") == 0) {
> -		/* Duplicate the command to be able to replace span in it */
> -		for (i = 0; uparams->benchmark_cmd[i]; i++)
> -			new_cmd[i] = uparams->benchmark_cmd[i];
> -		new_cmd[i] = NULL;
> -
> -		ret = asprintf(&span_str, "%zu", span);
> -		if (ret < 0)
> -			return -1;
> -		new_cmd[1] = span_str;
> -		cmd = new_cmd;
> -	}
> +	param.fill_buf.buf_size = span;
> +	param.fill_buf.memflush = 1;
> +	param.fill_buf.operation = 0;
> +	param.fill_buf.once = false;
>  
>  	remove(RESULT_FILE_NAME);
>  
> -	ret = resctrl_val(test, uparams, cmd, &param);
> +	ret = resctrl_val(test, uparams, &param);
>  	if (ret)
> -		goto out;
> +		return ret;
>  
>  	ret = check_results(&param, span, n);
>  	if (ret && (get_vendor() == ARCH_INTEL))
>  		ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
>  
> -out:
> -	free(span_str);
> -
>  	return ret;
>  }
>  
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index ab8496a4925b..8ad433495f61 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -174,7 +174,12 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
>  
>  	remove(RESULT_FILE_NAME);
>  
> -	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
> +	param.fill_buf.buf_size = DEFAULT_SPAN;
> +	param.fill_buf.memflush = 1;
> +	param.fill_buf.operation = 0;
> +	param.fill_buf.once = false;
> +
> +	ret = resctrl_val(test, uparams, &param);
>  	if (ret)
>  		return ret;
>  
> diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
> index 6b5a3b52d861..b6883f274c74 100644
> --- a/tools/testing/selftests/resctrl/mbm_test.c
> +++ b/tools/testing/selftests/resctrl/mbm_test.c
> @@ -142,11 +142,16 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param
>  
>  	remove(RESULT_FILE_NAME);
>  
> -	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
> +	param.fill_buf.buf_size = DEFAULT_SPAN;
> +	param.fill_buf.memflush = 1;
> +	param.fill_buf.operation = 0;
> +	param.fill_buf.once = false;
> +
> +	ret = resctrl_val(test, uparams, &param);
>  	if (ret)
>  		return ret;
>  
> -	ret = check_results(DEFAULT_SPAN);
> +	ret = check_results(param.fill_buf.buf_size);
>  	if (ret && (get_vendor() == ARCH_INTEL))
>  		ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
>  
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 0afbc4dd18e4..0e5456165a6a 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -79,6 +79,26 @@ struct resctrl_test {
>  	void		(*cleanup)(void);
>  };
>  
> +/*
> + * fill_buf_param:	"fill_buf" benchmark parameters
> + * @buf_size:		Size (in bytes) of buffer used in benchmark.
> + *			"fill_buf" allocates and initializes buffer of
> + *			@buf_size.
> + * @operation:		If 0, then only read operations are performed on
> + *			the buffer, if 1 then only write operations are
> + *			performed on the buffer.
> + * @memflush:		1 if buffer should be flushed after
> + *			allocation and initialization.
> + * @once:		Benchmark will perform @operation once if true,
> + *			infinitely (until terminated) if false.
> + */
> +struct fill_buf_param {
> +	size_t		buf_size;
> +	int		operation;
> +	int		memflush;
> +	int		once;
> +};
> +
>  /*
>   * resctrl_val_param:	resctrl test parameters
>   * @ctrlgrp:		Name of the control monitor group (con_mon grp)
> @@ -87,21 +107,23 @@ struct resctrl_test {
>   * @init:		Callback function to initialize test environment
>   * @setup:		Callback function to setup per test run environment
>   * @measure:		Callback that performs the measurement (a single test)
> + * @fill_buf:		Parameters for default "fill_buf" benchmark
>   */
>  struct resctrl_val_param {
> -	const char	*ctrlgrp;
> -	const char	*mongrp;
> -	char		filename[64];
> -	unsigned long	mask;
> -	int		num_of_runs;
> -	int		(*init)(const struct resctrl_val_param *param,
> -				int domain_id);
> -	int		(*setup)(const struct resctrl_test *test,
> -				 const struct user_params *uparams,
> -				 struct resctrl_val_param *param);
> -	int		(*measure)(const struct user_params *uparams,
> -				   struct resctrl_val_param *param,
> -				   pid_t bm_pid);
> +	const char		*ctrlgrp;
> +	const char		*mongrp;
> +	char			filename[64];
> +	unsigned long		mask;
> +	int			num_of_runs;
> +	int			(*init)(const struct resctrl_val_param *param,
> +					int domain_id);
> +	int			(*setup)(const struct resctrl_test *test,
> +					 const struct user_params *uparams,
> +					 struct resctrl_val_param *param);
> +	int			(*measure)(const struct user_params *uparams,
> +					   struct resctrl_val_param *param,
> +					   pid_t bm_pid);
> +	struct fill_buf_param	fill_buf;
>  };
>  
>  struct perf_event_read {
> @@ -151,7 +173,6 @@ void initialize_mem_bw_resctrl(const struct resctrl_val_param *param,
>  			       int domain_id);
>  int resctrl_val(const struct resctrl_test *test,
>  		const struct user_params *uparams,
> -		const char * const *benchmark_cmd,
>  		struct resctrl_val_param *param);
>  unsigned long create_bit_mask(unsigned int start, unsigned int len);
>  unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index ecbb7605a981..ce8fcc769d57 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -162,7 +162,7 @@ int main(int argc, char **argv)
>  	bool test_param_seen = false;
>  	struct user_params uparams;
>  	char *span_str = NULL;
> -	int ret, c, i;
> +	int c, i;
>  
>  	init_user_params(&uparams);
>  
> @@ -257,19 +257,6 @@ int main(int argc, char **argv)
>  
>  	filter_dmesg();
>  
> -	if (!uparams.benchmark_cmd[0]) {
> -		/* If no benchmark is given by "-b" argument, use fill_buf. */
> -		uparams.benchmark_cmd[0] = "fill_buf";
> -		ret = asprintf(&span_str, "%u", DEFAULT_SPAN);
> -		if (ret < 0)
> -			ksft_exit_fail_msg("Out of memory!\n");
> -		uparams.benchmark_cmd[1] = span_str;
> -		uparams.benchmark_cmd[2] = "1";
> -		uparams.benchmark_cmd[3] = "0";
> -		uparams.benchmark_cmd[4] = "false";
> -		uparams.benchmark_cmd[5] = NULL;
> -	}
> -
>  	ksft_set_plan(tests);
>  
>  	for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index 574b72604f95..9a5a9a67e059 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -612,21 +612,17 @@ int measure_mem_bw(const struct user_params *uparams,
>   *			the benchmark
>   * @test:		test information structure
>   * @uparams:		user supplied parameters
> - * @benchmark_cmd:	benchmark command and its arguments
>   * @param:		parameters passed to resctrl_val()
>   *
>   * Return:		0 when the test was run, < 0 on error.
>   */
>  int resctrl_val(const struct resctrl_test *test,
>  		const struct user_params *uparams,
> -		const char * const *benchmark_cmd,
>  		struct resctrl_val_param *param)
>  {
> -	int domain_id, operation = 0, memflush = 1;
> -	size_t span = DEFAULT_SPAN;
>  	unsigned char *buf = NULL;
>  	cpu_set_t old_affinity;
> -	bool once = false;
> +	int domain_id;
>  	int ret = 0;
>  	pid_t ppid;
>  
> @@ -666,21 +662,9 @@ int resctrl_val(const struct resctrl_test *test,
>  	 * how this impacts "write" benchmark, but no test currently
>  	 * uses this.
>  	 */
> -	if (strcmp(benchmark_cmd[0], "fill_buf") == 0) {
> -		span = strtoul(benchmark_cmd[1], NULL, 10);
> -		memflush =  atoi(benchmark_cmd[2]);
> -		operation = atoi(benchmark_cmd[3]);
> -		if (!strcmp(benchmark_cmd[4], "true")) {
> -			once = true;
> -		} else if (!strcmp(benchmark_cmd[4], "false")) {
> -			once = false;
> -		} else {
> -			ksft_print_msg("Invalid once parameter\n");
> -			ret = -EINVAL;
> -			goto reset_affinity;
> -		}
> -
> -		buf = alloc_buffer(span, memflush);
> +	if (!uparams->benchmark_cmd[0]) {
> +		buf = alloc_buffer(param->fill_buf.buf_size,
> +				   param->fill_buf.memflush);
>  		if (!buf) {
>  			ret = -ENOMEM;
>  			goto reset_affinity;
> @@ -699,13 +683,17 @@ int resctrl_val(const struct resctrl_test *test,
>  	 * terminated.
>  	 */
>  	if (bm_pid == 0) {
> -		if (strcmp(benchmark_cmd[0], "fill_buf") == 0) {
> -			if (operation == 0)
> -				fill_cache_read(buf, span, once);
> +		if (!uparams->benchmark_cmd[0]) {
> +			if (param->fill_buf.operation == 0)
> +				fill_cache_read(buf,
> +						param->fill_buf.buf_size,
> +						param->fill_buf.once);
>  			else
> -				fill_cache_write(buf, span, once);
> +				fill_cache_write(buf,
> +						 param->fill_buf.buf_size,
> +						 param->fill_buf.once);
>  		} else {
> -			execvp(benchmark_cmd[0], (char **)benchmark_cmd);
> +			execvp(uparams->benchmark_cmd[0], (char **)uparams->benchmark_cmd);
>  		}
>  		exit(EXIT_SUCCESS);
>  	}
> 

If I didn't miss anything important, this change takes away the ability to
alter fill_buf's parameters using -b option which to me felt the most 
useful way to use that parameter. The current code of course was lacks 
many safeguards for that case but still felt an useful feature.

I suggest that while parsing -b parameter, check if it starts with 
"fill_buf", and if it does, parse the argument into fill_buf_param in 
user_params which will override the default fill_buf parameters.

While parsing, adding new sanity checks wouldn't be a bad idea.

It might be some parameters might be better to be overridden always by the 
tests, e.g. "once" but specifying "operation" (W instead or R) or 
"buf_size" seems okay use cases to me.
Ilpo Järvinen Aug. 30, 2024, 11:25 a.m. UTC | #3
On Thu, 29 Aug 2024, Reinette Chatre wrote:

> By default the MBM and MBA tests use the "fill_buf" benchmark to
> read from a buffer with the goal to measure the memory bandwidth
> generated by this buffer access.
> 
> Care should be taken when sizing the buffer used by the "fill_buf"
> benchmark. If the buffer is small enough to fit in the cache then
> it cannot be expected that the benchmark will generate much memory
> bandwidth. For example, on a system with 320MB L3 cache the existing
> hardcoded default of 250MB is insufficient.
> 
> Use the measured cache size to determine a buffer size that can be
> expected to trigger memory access while keeping the existing default
> as minimum that has been appropriate for testing so far.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  tools/testing/selftests/resctrl/mba_test.c | 8 +++++++-
>  tools/testing/selftests/resctrl/mbm_test.c | 8 +++++++-
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index 8ad433495f61..cad473b81a64 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -170,11 +170,17 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
>  		.setup		= mba_setup,
>  		.measure	= mba_measure,
>  	};
> +	unsigned long cache_total_size = 0;
>  	int ret;
>  
>  	remove(RESULT_FILE_NAME);
>  
> -	param.fill_buf.buf_size = DEFAULT_SPAN;
> +	ret = get_cache_size(uparams->cpu, "L3", &cache_total_size);
> +	if (ret)
> +		return ret;
> +
> +	param.fill_buf.buf_size = cache_total_size > DEFAULT_SPAN ?
> +				  cache_total_size * 2 : DEFAULT_SPAN;

Should the check leave a bit of safeguard so that the buf_size is at 
least 2x (or x1.25 or some other factor)?

In here buf_size immediate jumps from 1x -> 2x when cache_total_size goes 
from DEFAULT_SPAN to DEFAULT_SPAN+1 (obviously L3 size won't be odd like 
that but I think you get my point).

Also, user might want to override this as mentioned in my reply to the 
previous patch.
Reinette Chatre Aug. 30, 2024, 4 p.m. UTC | #4
Hi Ilpo,

On 8/30/24 4:25 AM, Ilpo Järvinen wrote:
> On Thu, 29 Aug 2024, Reinette Chatre wrote:
> 
>> By default the MBM and MBA tests use the "fill_buf" benchmark to
>> read from a buffer with the goal to measure the memory bandwidth
>> generated by this buffer access.
>>
>> Care should be taken when sizing the buffer used by the "fill_buf"
>> benchmark. If the buffer is small enough to fit in the cache then
>> it cannot be expected that the benchmark will generate much memory
>> bandwidth. For example, on a system with 320MB L3 cache the existing
>> hardcoded default of 250MB is insufficient.
>>
>> Use the measured cache size to determine a buffer size that can be
>> expected to trigger memory access while keeping the existing default
>> as minimum that has been appropriate for testing so far.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>>   tools/testing/selftests/resctrl/mba_test.c | 8 +++++++-
>>   tools/testing/selftests/resctrl/mbm_test.c | 8 +++++++-
>>   2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
>> index 8ad433495f61..cad473b81a64 100644
>> --- a/tools/testing/selftests/resctrl/mba_test.c
>> +++ b/tools/testing/selftests/resctrl/mba_test.c
>> @@ -170,11 +170,17 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
>>   		.setup		= mba_setup,
>>   		.measure	= mba_measure,
>>   	};
>> +	unsigned long cache_total_size = 0;
>>   	int ret;
>>   
>>   	remove(RESULT_FILE_NAME);
>>   
>> -	param.fill_buf.buf_size = DEFAULT_SPAN;
>> +	ret = get_cache_size(uparams->cpu, "L3", &cache_total_size);
>> +	if (ret)
>> +		return ret;
>> +
>> +	param.fill_buf.buf_size = cache_total_size > DEFAULT_SPAN ?
>> +				  cache_total_size * 2 : DEFAULT_SPAN;
> 
> Should the check leave a bit of safeguard so that the buf_size is at
> least 2x (or x1.25 or some other factor)?
> 
> In here buf_size immediate jumps from 1x -> 2x when cache_total_size goes
> from DEFAULT_SPAN to DEFAULT_SPAN+1 (obviously L3 size won't be odd like
> that but I think you get my point).

Good catch. Will fix.

> 
> Also, user might want to override this as mentioned in my reply to the
> previous patch.
> 

ack.

Reinette