[v4,00/29] selftests/resctrl: CAT test improvements & generalized test framework

Message ID 20231215150515.36983-1-ilpo.jarvinen@linux.intel.com

Message

Ilpo Järvinen Dec. 15, 2023, 3:04 p.m. UTC
Hi all,

Here's v4 of the series to improve the resctrl selftests with a
generalized test framework and a rewritten CAT test.

The series contains the following improvements:

- Excludes shareable bits from CAT test allocation to avoid interference
- Replaces file "sink" with a volatile variable
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites the CAT test to make it reliable and to truly measure
  whether CAT is working
- Introduces a generalized test framework, making it easier to add new
  tests
- Lots of other cleanups & refactoring
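
The second and third items in the list above can be pictured with a
rough sketch like the one below. It is not the code from the patches:
read_scattered(), read_span(), value_sink, CL_SIZE and IDX_MULT, as well
as the values 64 and 23, are assumptions made up for illustration. The
idea is to read one byte per half cache line in a non-sequential order
and to store the result into a volatile variable so that the compiler
cannot drop the reads.

```c
#include <stddef.h>

#define CL_SIZE		64	/* assumed cache line size in bytes */
#define IDX_MULT	23	/* assumed multiplier; odd, so coprime with a power-of-two slot count */

/* Volatile sink: storing the result here keeps the compiler from dropping the reads. */
static volatile int value_sink;

/* Read one byte per half cache line, visiting the slots in a scattered order. */
static unsigned char read_scattered(const unsigned char *buf, size_t buf_size)
{
	size_t step = CL_SIZE / 2;
	size_t slots = buf_size / step;
	size_t i, idx = 0;
	unsigned char sum = 0;

	for (i = 0; i < slots; i++) {
		sum += buf[idx * step];

		/* Equivalent to idx = (idx + IDX_MULT) % slots, without a division. */
		idx += IDX_MULT;
		while (idx >= slots)
			idx -= slots;
	}

	return sum;
}

/* Hypothetical wrapper: do one pass and park the result in the sink. */
static void read_span(const unsigned char *buf, size_t buf_size)
{
	value_sink = read_scattered(buf, buf_size);
}
```

When the slot count and IDX_MULT are coprime, every slot is visited
exactly once per pass, so the amount of memory touched matches a linear
walk while the access order is harder for a prefetcher to predict.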

This series has been tested across a large number of systems from
different generations.

v4:
- Reworded a few error prints
- Changelog improvements
- fprintf()'s error handling changed ksft_perror() -> ksft_print_msg()
- Keep using ksft_*() instead of fprintf() in get_bit_mask()
- Check against div-by-zero
- Adjust one return type

v3:
- New patches to handle return errno, perror() and return value comments
- Tweak changelogs
- Moved error printout removal to other patch
- Zero bit CBM returns error
- Tweak comments
- Make get_shareable_mask() static
- Return directly without storing result into ret variable first
- llc -> LLC
- Altered changelog and removed "the whole time" wording because
  LLC occupancy results are still unsigned long
- Altered changelog's wording to not say "a volatile pointer"
- Make min_diff_percent and MIN_DIFF_PERCENT_PER_BIT unsigned long
- Add patch to restore CPU affinity after CAT test
- Move uparams clear into init function
- Add CPU vendor ID bitmask comment
- Use test_resource_feature_check(test) in CMT
- "feature" -> "resource" in function comment

v2:
- Postpone adding L2 CAT test as more investigations are necessary
- Add patch to remove ctrlc_handler() from wrong place
- Improvements to changelogs
- Function comments improvements & comment cleanups
- Move some parts of the changes into a more logical patch
- If checks: buf == NULL -> !buf
- Variable naming:
        - p -> buf
        - cbm_mask_path -> cbm_path
- Function naming:
        - get_cbm_mask() -> get_full_cbm()
        - cache_size() -> cache_portion_size()
- Use PATH_MAX
- Improved cache_portion_size() parameter names
- int count -> unsigned int
- Pass filename to measurement taking functions instead of
  resctrl_val_param
- !lines ? : reversal
- Removed bogus static from function local variable
- Open perf fd only once, reset & enable in the innermost test loop
- Add perf fd ioctl() error handling
- Add patch to change compiler optimization prevention "sink" from file
  to volatile variable
- Remove cpu_no and resource (the latter was added in v1) members from
  resctrl_val_param (pass uparams and test where those are needed)
- Removed ARRAY_SIZE() macro
- Add patch to rename "resource_id" to "domain_id"
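
The two perf fd items above (open once, reset & enable in the innermost
loop, with ioctl() error handling) follow a common perf_event_open()
pattern. The sketch below shows that pattern under stated assumptions;
llc_miss_open(), llc_miss_measure() and touch_memory() are hypothetical
names and the event attributes are illustrative, not copied from the
series.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a cache-miss counter once, initially disabled. */
static int llc_miss_open(pid_t pid, int cpu)
{
	struct perf_event_attr pea;

	memset(&pea, 0, sizeof(pea));
	pea.type = PERF_TYPE_HARDWARE;
	pea.size = sizeof(pea);
	pea.config = PERF_COUNT_HW_CACHE_MISSES;
	pea.disabled = 1;
	pea.exclude_kernel = 1;
	pea.exclude_hv = 1;

	return (int)syscall(__NR_perf_event_open, &pea, pid, cpu, -1, 0UL);
}

/* Hypothetical workload: touch one byte per assumed 64-byte cache line. */
static void touch_memory(void)
{
	static unsigned char buf[1 << 16];
	volatile unsigned char sink = 0;
	size_t i;

	for (i = 0; i < sizeof(buf); i += 64)
		sink = (unsigned char)(sink + buf[i]);
}

/*
 * Hypothetical helper for the innermost loop: reset and enable the
 * already open counter, run the workload, then disable and read it.
 * Returns 0 on success, -1 if any ioctl() or the read fails.
 */
static int llc_miss_measure(int fd, void (*workload)(void), uint64_t *count)
{
	if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0)
		return -1;
	if (ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0)
		return -1;

	workload();

	if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0)
		return -1;
	if (read(fd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
		return -1;

	return 0;
}
```

A test would call llc_miss_open() once before its loop, call
llc_miss_measure() on each iteration, and close() the fd afterwards.
Opening the counter may fail on systems that restrict perf access, so
the return value of the open needs checking as well.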


Ilpo Järvinen (29):
  selftests/resctrl: Convert perror() to ksft_perror() or
    ksft_print_msg()
  selftests/resctrl: Return -1 instead of errno on error
  selftests/resctrl: Don't use ctrlc_handler() outside signal handling
  selftests/resctrl: Change function comments to say < 0 on error
  selftests/resctrl: Split fill_buf to allow tests finer-grained control
  selftests/resctrl: Refactor fill_buf functions
  selftests/resctrl: Refactor get_cbm_mask() and rename to
    get_full_cbm()
  selftests/resctrl: Mark get_cache_size() cache_type const
  selftests/resctrl: Create cache_portion_size() helper
  selftests/resctrl: Exclude shareable bits from schemata in CAT test
  selftests/resctrl: Split measure_cache_vals()
  selftests/resctrl: Split show_cache_info() to test specific and
    generic parts
  selftests/resctrl: Remove unnecessary __u64 -> unsigned long
    conversion
  selftests/resctrl: Remove nested calls in perf event handling
  selftests/resctrl: Consolidate naming of perf event related things
  selftests/resctrl: Improve perf init
  selftests/resctrl: Convert perf related globals to locals
  selftests/resctrl: Move cat_val() to cat_test.c and rename to
    cat_test()
  selftests/resctrl: Open perf fd before start & add error handling
  selftests/resctrl: Replace file write with volatile variable
  selftests/resctrl: Read in less obvious order to defeat prefetch
    optimizations
  selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
  selftests/resctrl: Restore the CPU affinity after CAT test
  selftests/resctrl: Create struct for input parameters
  selftests/resctrl: Introduce generalized test framework
  selftests/resctrl: Pass write_schemata() resource instead of test name
  selftests/resctrl: Add helper to convert L2/3 to integer
  selftests/resctrl: Rename resource ID to domain ID
  selftests/resctrl: Get domain id from cache id

 tools/testing/selftests/resctrl/cache.c       | 287 +++++----------
 tools/testing/selftests/resctrl/cat_test.c    | 337 +++++++++++-------
 tools/testing/selftests/resctrl/cmt_test.c    |  80 +++--
 tools/testing/selftests/resctrl/fill_buf.c    | 132 ++++---
 tools/testing/selftests/resctrl/mba_test.c    |  30 +-
 tools/testing/selftests/resctrl/mbm_test.c    |  32 +-
 tools/testing/selftests/resctrl/resctrl.h     | 135 +++++--
 .../testing/selftests/resctrl/resctrl_tests.c | 197 ++++------
 tools/testing/selftests/resctrl/resctrl_val.c | 138 +++----
 tools/testing/selftests/resctrl/resctrlfs.c   | 321 +++++++++++------
 10 files changed, 945 insertions(+), 744 deletions(-)

Comments

Reinette Chatre Dec. 15, 2023, 5:45 p.m. UTC | #1
Hi Ilpo and Shuah,

On 12/15/2023 7:04 AM, Ilpo Järvinen wrote:
> Here's v4 series to improve resctrl selftests with generalized test
> framework and rewritten CAT test.
> 
> The series contains following improvements:
> 
> - Excludes shareable bits from CAT test allocation to avoid interference
> - Replaces file "sink" with a volatile variable
> - Alters read pattern to defeat HW prefetcher optimizations
> - Rewrites CAT test to make the CAT test reliable and truly measure
>   if CAT is working or not
> - Introduces generalized test framework making easier to add new tests
> - Lots of other cleanups & refactoring
> 
> This series has been tested across a large number of systems from
> different generations.

Ilpo, thank you very much for this great cleanup and for creating a
reliable CAT test. This work is focused on kernel health and greatly
appreciated.

All patches in this series should have my reviewed-by tag. For
confirmation, for this whole series:
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Shuah, could you please consider this series for inclusion at
your convenience?

Thank you very much.

Reinette

Reinette Chatre Dec. 15, 2023, 11:45 p.m. UTC | #2
On 12/15/2023 9:45 AM, Reinette Chatre wrote:
> Hi Ilpo and Shuah,
> 
> On 12/15/2023 7:04 AM, Ilpo Järvinen wrote:
>> Here's v4 series to improve resctrl selftests with generalized test
>> framework and rewritten CAT test.
>>
>> The series contains following improvements:
>>
>> - Excludes shareable bits from CAT test allocation to avoid interference
>> - Replaces file "sink" with a volatile variable
>> - Alters read pattern to defeat HW prefetcher optimizations
>> - Rewrites CAT test to make the CAT test reliable and truly measure
>>   if CAT is working or not
>> - Introduces generalized test framework making easier to add new tests
>> - Lots of other cleanups & refactoring
>>
>> This series has been tested across a large number of systems from
>> different generations.
> 
> Ilpo, thank you very much for this great cleanup and a creating a
> reliable CAT test. This work is focused on kernel health and greatly
> appreciated.
> 
> All patches in this series should have my reviewed-by tag. For
> confirmation, for this whole series:
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> Shuah, could you please consider this series for inclusion at
> your convenience?

Just in case somebody tries this series out against kernel v6.7-rc5 ...

A problematic perf patch made it into v6.7-rc5 (v6.7-rc4 and before are fine).
When testing this series against kernel v6.7-rc5 the splat reported at [1]
is triggered. A perf fix [2] is already queued up so all will be fine when testing
this series against a kernel with [2] merged.

Reinette


[1] https://lore.kernel.org/lkml/20231214000620.3081018-1-lucas.demarchi@intel.com/
[2] https://lore.kernel.org/lkml/20231215112450.3972309-1-mark.rutland@arm.com/