[v1,0/9] selftests/mm fixes for arm64

Message ID 20230713135440.3651409-1-ryan.roberts@arm.com

Message

Ryan Roberts July 13, 2023, 1:54 p.m. UTC
Hi All,

Given my on-going work on large anon folios and contpte mappings, I decided it
would be a good idea to start running mm selftests to help guard against
regressions. However, it soon became clear that I couldn't get the suite to run
cleanly on arm64 with a vanilla v6.5-rc1 kernel (perhaps I'm just doing it
wrong??), so got stuck in a rabbit hole trying to debug and fix all the issues.
Some were down to misconfigurations, but I also found a number of issues with
the tests and even a couple of issues with the kernel.

This series aims to fix (most of) the test issues. It applies on top of
v6.5-rc1.


Reproducing
-----------

What follows is a write up of how I'm running the tests and the results I see
with this series applied. I don't yet have a concrete understanding of all of
the remaining failures. So if anyone has any comments on my setup or reasons for
the test failures it would be great to hear.

Source: v6.5-rc1 + this series + [1] + [2]. [1] is a patch from Florent Revest to
fix mdwe mmap_FIXED tests. [2] is a fix for a regression in the kernel that I
found by running `mlock-random-test` and `mlock2-tests`.

Compile the kernel (on arm64 system):

$ make defconfig
$ ./scripts/config --enable CONFIG_SQUASHFS_LZ4
$ ./scripts/config --enable CONFIG_SQUASHFS_LZO
$ ./scripts/config --enable CONFIG_SQUASHFS_XZ
$ ./scripts/config --enable CONFIG_SQUASHFS_ZSTD
$ ./scripts/config --enable CONFIG_XFS_FS
$ ./scripts/config --enable CONFIG_SYSVIPC
$ ./scripts/config --enable CONFIG_USERFAULTFD
$ ./scripts/config --enable CONFIG_TEST_VMALLOC
$ ./scripts/config --enable CONFIG_GUP_TEST
$ ./scripts/config --enable CONFIG_TRANSPARENT_HUGEPAGE
$ ./scripts/config --enable CONFIG_MEM_SOFT_DIRTY
$ make olddefconfig
$ make -s -j`nproc` Image

(In the above case, I'm building/testing a 4K kernel).
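
As a rough sketch (not part of the series), the option list above could be
applied in a single loop. The function name and the `CONFIG_TOOL` override are
made up for illustration; the override only exists so the loop can be dry-run
with `echo` outside a kernel tree.

```shell
#!/bin/sh
# Sketch only: enable the same options as the individual commands above.
# CONFIG_TOOL is a hypothetical override; by default it points at the
# kernel tree's scripts/config helper used in the original commands.
CONFIG_TOOL="${CONFIG_TOOL:-./scripts/config}"

enable_mm_selftest_options() {
	for opt in SQUASHFS_LZ4 SQUASHFS_LZO SQUASHFS_XZ SQUASHFS_ZSTD \
		   XFS_FS SYSVIPC USERFAULTFD TEST_VMALLOC GUP_TEST \
		   TRANSPARENT_HUGEPAGE MEM_SOFT_DIRTY; do
		# Intentionally unquoted so that CONFIG_TOOL=echo works as a dry run.
		$CONFIG_TOOL --enable "CONFIG_$opt" || return 1
	done
}
```

Calling `enable_mm_selftest_options` between `make defconfig` and
`make olddefconfig` should be equivalent to the eleven commands listed above.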

Note that it turns out that arm64 doesn't really support ZONE_DEVICE; although
it defines ARCH_HAS_PTE_DEVMAP, it can't allocate `struct page`s for arbitrary
physical addresses. This means that the TEST_HMM module causes warnings to be
emitted when initializing because it tries to reserve an arbitrary PA range and
then requests `struct page`s for it. I haven't fully investigated this yet, but
for now, I'm just deliberately excluding ZONE_DEVICE (which TEST_HMM depends
upon). This means that the `hmm-tests` selftest gets skipped at runtime.

Compile the tests:

$ make -j`nproc` headers_install
$ make -C tools/testing/selftests TARGETS=mm install INSTALL_PATH=<path/to/install>

Start a VM running the kernel we just compiled:

$ taskset -c 8-15 qemu-system-aarch64							\
	-object memory-backend-file,id=mem0,size=6G,mem-path=/hugetlbfs,merge=off,prealloc=on,host-nodes=0,policy=bind,align=1G \
	-object memory-backend-file,id=mem1,size=6G,mem-path=/hugetlbfs,merge=off,prealloc=on,host-nodes=0,policy=bind,align=1G \
	-nographic -enable-kvm -machine virt,gic-version=3 -cpu max			\
	-smp 8 -m 12G									\
	-numa node,memdev=mem0,cpus=0-3,nodeid=0					\
	-numa node,memdev=mem1,cpus=4-7,nodeid=1					\
	-drive if=virtio,format=raw,file=ubuntu-22.04.xfs				\
	-object rng-random,filename=/dev/urandom,id=rng0				\
	-device virtio-scsi-pci,id=scsi0						\
	-netdev user,id=net0,hostfwd=tcp::8022-:22					\
	-device virtio-rng-pci,rng=rng0							\
	-device virtio-net-pci,netdev=net0						\
	-kernel arch/arm64/boot/Image							\
	-append "earlycon root=/dev/vda2 secretmem.enable hugepagesz=1G hugepages=0:2,1:2 hugepagesz=32M hugepages=0:2,1:2 default_hugepagesz=2M hugepages=0:64,1:64 hugepagesz=64K hugepages=0:2,1:2"

This starts a VM with 2 NUMA nodes (needed by ksm and migration tests), with 6G
of memory and 4 CPUs on each node. The kernel command line enables secretmem
(needed for `memfd_secret` test), and preallocates a bunch of huge pages
(divined by reading the comments and source for a bunch of tests that require
huge pages). 128 pages (256M) of the default 2M huge page size, and 4 pages of
each of the other sizes appear to be sufficient. I'm allocating half on each
NUMA node.
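
To check inside the guest that the pools requested on the command line were
actually reserved, something like the following could be used. It is a sketch
that reads the per-size `nr_hugepages` files under
`/sys/kernel/mm/hugepages`; the function name and the optional root-directory
argument are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch only: print the page count and total size of every huge page pool.
# The optional first argument overrides the sysfs directory (hypothetical
# parameter, handy for trying the function against a scratch directory).
report_hugepages() {
	root="${1:-/sys/kernel/mm/hugepages}"
	for dir in "$root"/hugepages-*kB; do
		# Skips the unexpanded glob when no pools exist.
		[ -r "$dir/nr_hugepages" ] || continue
		size_kb="${dir##*/hugepages-}"
		size_kb="${size_kb%kB}"
		count="$(cat "$dir/nr_hugepages")"
		echo "${size_kb}kB pages=${count} total_kB=$((size_kb * count))"
	done
}
```

With the command line above, the 2048kB pool would be expected to show 128
pages and each of the other sizes 4 pages, if the reservations succeeded.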

Once booted, copy the selftests we just compiled onto it.
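
Since the QEMU command forwards host port 8022 to the guest's port 22, one way
to do the copy is `scp` over that forwarded port. This is only a sketch: the
guest user name, the destination directory `selftests` and the `RUN` dry-run
variable are all assumptions, not something taken from the setup above.

```shell
#!/bin/sh
# Sketch only: copy an installed selftests tree into the guest via the
# forwarded SSH port. Set RUN=echo to print the command instead of running it.
copy_selftests() {
	src="$1"            # install path used for INSTALL_PATH above
	user="$2"           # guest user name (assumption)
	port="${3:-8022}"   # host side of hostfwd=tcp::8022-:22
	${RUN:-} scp -r -P "$port" "$src" "$user@localhost:selftests"
}
```

For example, `copy_selftests <path/to/install> <guest-user>` would place the
tree in `~/selftests` on the guest.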

On the VM, run the tests:

$ cd path/to/selftests
$ sudo ./run_kselftest.sh

or alternatively:

$ cd path/to/selftests/mm
$ sudo ./run_vmtests.sh


Test Results
------------

TOP-LEVEL SUMMARY: PASS=42 SKIP=4 FAIL=2

Only showing nested tests if they are skipped or failed.

[PASS] hugepage-mmap
[PASS] hugepage-shm
[PASS] map_hugetlb
[PASS] hugepage-mremap
[PASS] hugepage-vmemmap
[PASS] hugetlb-madvise
[PASS] map_fixed_noreplace
[PASS] gup_test -u
[PASS] gup_test -a
[PASS] gup_test -ct -F 0x1 0 19 0x1000
[PASS] gup_longterm
[PASS] uffd-unit-tests
[PASS] uffd-stress anon 20 16
[PASS] uffd-stress hugetlb 128 32
[PASS] uffd-stress hugetlb-private 128 32
[PASS] uffd-stress shmem 20 16
[PASS] uffd-stress shmem-private 20 16
[PASS] compaction_test
[PASS] on-fault-limit
[PASS] map_populate
[PASS] mlock-random-test
[PASS] mlock2-tests
[PASS] mrelease_test
[PASS] mremap_test
[PASS] thuge-gen
[PASS] virtual_address_range
[SKIP] va_high_addr_switch.sh
	# 4K kernel does not support big enough VA space for test
[SKIP] test_vmalloc.sh smoke
	# Test requires test_vmalloc kernel module which isn't present
[PASS] mremap_dontunmap
[SKIP] test_hmm.sh smoke
	# Test requires test_hmm kernel module - see ZONE_DEVICE issue above
[PASS] madv_populate
	[PASS] test_softdirty
		[SKIP] range is not softdirty
		[SKIP] MADV_POPULATE_READ
		[SKIP] range is not softdirty
		[SKIP] MADV_POPULATE_WRITE
		[SKIP] range is softdirty
		# All skipped because arm64 does not support soft-dirty
[PASS] memfd_secret
[PASS] ksm_tests -M -p 10
[PASS] ksm_tests -U
[PASS] ksm_tests -Z -p 10 -z 0
[PASS] ksm_tests -Z -p 10 -z 1
[PASS] ksm_tests -N -m 1
[PASS] ksm_tests -N -m 0
[PASS] ksm_functional_tests
	[SKIP] test_unmerge_uffd_wp
		# UFFD_FEATURE_PAGEFAULT_FLAG_WP not available on arm64
[PASS] ksm_functional_tests
	[SKIP] test_unmerge_uffd_wp
		# UFFD_FEATURE_PAGEFAULT_FLAG_WP not available on arm64
[SKIP] soft-dirty
	# Skipped because arm64 does not support soft-dirty
[FAIL] cow
	[FAIL] vmsplice() + unmap in child ... with hugetlb
	[FAIL] vmsplice() + unmap in child with mprotect() optimization ... with hugetlb
	[FAIL] vmsplice() before fork(), unmap in parent after fork() ... with hugetlb
	[FAIL] vmsplice() + unmap in parent after fork() ... with hugetlb
		# Above are known issues for vmsplice + hugetlb
		# Reproduces on x86
	[SKIP] Basic COW after fork() ... with swapped-out, PTE-mapped THP
	[SKIP] Basic COW after fork() with mprotect() optimization ... with swapped-out, PTE-mapped THP
	[SKIP] vmsplice() + unmap in child ... with swapped-out, PTE-mapped THP
	[SKIP] vmsplice() + unmap in child with mprotect() optimization ... with swapped-out, PTE-mapped THP
	[SKIP] vmsplice() before fork(), unmap in parent after fork() ... with swapped-out, PTE-mapped THP
	[SKIP] vmsplice() + unmap in parent after fork() ... with swapped-out, PTE-mapped THP
	[SKIP] R/O-mapping a page registered as iouring fixed buffer ... with swapped-out, PTE-mapped THP
	[SKIP] fork() with an iouring fixed buffer ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP pin on R/O-mapped shared page ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP-fast pin on R/O-mapped shared page ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP pin on R/O-mapped previously-shared page ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP-fast pin on R/O-mapped previously-shared page ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP pin on R/O-mapped exclusive page ... with swapped-out, PTE-mapped THP
	[SKIP] R/O GUP-fast pin on R/O-mapped exclusive page ... with swapped-out, PTE-mapped THP
		# Above all skipped due to "MADV_PAGEOUT did not work, is swap enabled?"
		# swap is enabled though
		# Reproduces on x86
	[SKIP] Basic COW after fork() when collapsing after fork() (fully shared)
		# MADV_COLLAPSE failed: Invalid argument
[PASS] khugepaged
[PASS] transhuge-stress -d 20
[PASS] split_huge_page_test
[FAIL] migration
	[FAIL] migration.shared_anon
		# move_pages() reports that the requested page was not migrated
		# after a few iterations.
[PASS] mkdirty
[PASS] mdwe_test
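
For anyone wanting to reproduce the top-level summary from a listing in the
format above, a small tally like this would do. It is a sketch that assumes
top-level results start at column 0 and nested results are indented, as in this
listing; the function name is made up.

```shell
#!/bin/sh
# Sketch only: count top-level [PASS]/[SKIP]/[FAIL] lines. Indented (nested)
# results do not match the anchored patterns and are therefore ignored.
summarize_results() {
	awk '
		/^\[PASS\]/ { pass++ }
		/^\[SKIP\]/ { skip++ }
		/^\[FAIL\]/ { fail++ }
		END { printf "PASS=%d SKIP=%d FAIL=%d\n", pass, skip, fail }
	' "$@"
}
```

It reads the file(s) named as arguments, or stdin when none are given.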


[1] https://lore.kernel.org/lkml/20230704153630.1591122-3-revest@chromium.org/
[2] https://lore.kernel.org/linux-mm/20230711175020.4091336-1-Liam.Howlett@oracle.com/

Thanks,
Ryan


Ryan Roberts (9):
  selftests: Line buffer test program's stdout
  selftests/mm: Give scripts execute permission
  selftests/mm: Skip soft-dirty tests on arm64
  selftests/mm: Enable mrelease_test for arm64
  selftests/mm: Fix thuge-gen test bugs
  selftests/mm: va_high_addr_switch should skip unsupported arm64
    configs
  selftests/mm: Make migration test robust to failure
  selftests/mm: Optionally pass duration to transhuge-stress
  selftests/mm: Run all tests from run_vmtests.sh

 tools/testing/selftests/kselftest/runner.sh   |  5 +-
 tools/testing/selftests/mm/Makefile           | 79 ++++++++++---------
 .../selftests/mm/charge_reserved_hugetlb.sh   |  0
 tools/testing/selftests/mm/check_config.sh    |  0
 .../selftests/mm/hugetlb_reparenting_test.sh  |  0
 tools/testing/selftests/mm/madv_populate.c    | 18 +++--
 tools/testing/selftests/mm/migration.c        | 14 +++-
 tools/testing/selftests/mm/mrelease_test.c    |  1 +
 tools/testing/selftests/mm/run_vmtests.sh     | 23 ++++++
 tools/testing/selftests/mm/settings           |  2 +-
 tools/testing/selftests/mm/soft-dirty.c       |  3 +
 tools/testing/selftests/mm/test_hmm.sh        |  0
 tools/testing/selftests/mm/test_vmalloc.sh    |  0
 tools/testing/selftests/mm/thuge-gen.c        |  4 +-
 tools/testing/selftests/mm/transhuge-stress.c | 12 ++-
 .../selftests/mm/va_high_addr_switch.c        |  3 +-
 .../selftests/mm/va_high_addr_switch.sh       |  0
 tools/testing/selftests/mm/vm_util.c          | 17 ++++
 tools/testing/selftests/mm/vm_util.h          |  1 +
 .../selftests/mm/write_hugetlb_memory.sh      |  0
 20 files changed, 127 insertions(+), 55 deletions(-)
 mode change 100644 => 100755 tools/testing/selftests/mm/charge_reserved_hugetlb.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/check_config.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/hugetlb_reparenting_test.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/test_hmm.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/test_vmalloc.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/va_high_addr_switch.sh
 mode change 100644 => 100755 tools/testing/selftests/mm/write_hugetlb_memory.sh

--
2.25.1

Comments

John Hubbard July 15, 2023, 12:04 a.m. UTC | #1
On 7/13/23 06:54, Ryan Roberts wrote:
> arm64 does not support the soft-dirty PTE bit. However there are tests
> in `madv_populate` and `soft-dirty` which assume it is supported and
> cause spurious failures to be reported when preferred behaviour would be
> to mark the tests as skipped.
> 
> Unfortunately, the only way to determine if the soft-dirty bit is
> supported is to write to a page, then see if the bit is set in
> /proc/self/pagemap. But the tests that we want to conditionally execute
> are testing precisely this. So if we introduced this feature check, we
> could accidentally turn a real failure (on a system that claims to
> support soft-dirty) into a skip.

...

> diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
> index cc5f144430d4..8a2cd161ec4d 100644
> --- a/tools/testing/selftests/mm/soft-dirty.c
> +++ b/tools/testing/selftests/mm/soft-dirty.c

Hi Ryan,

Probably very similar to what David is requesting: given that arm64
definitively does not support soft dirty, I'd suggest that we not even
*build* the soft dirty tests on arm64!

There is no need to worry about counting, skipping or waiving such
tests, either. Because it's just a non-issue: one does not care about
test status for something that is documented as "this feature is simply
unavailable here".


thanks,
Ryan Roberts July 17, 2023, 8:23 a.m. UTC | #2
On 15/07/2023 01:04, John Hubbard wrote:
> On 7/13/23 06:54, Ryan Roberts wrote:
>> arm64 does not support the soft-dirty PTE bit. However there are tests
>> in `madv_populate` and `soft-dirty` which assume it is supported and
>> cause spurious failures to be reported when preferred behaviour would be
>> to mark the tests as skipped.
>>
>> Unfortunately, the only way to determine if the soft-dirty bit is
>> supported is to write to a page, then see if the bit is set in
>> /proc/self/pagemap. But the tests that we want to conditionally execute
>> are testing precisely this. So if we introduced this feature check, we
>> could accidentally turn a real failure (on a system that claims to
>> support soft-dirty) into a skip.
> 
> ...
> 
>> diff --git a/tools/testing/selftests/mm/soft-dirty.c
>> b/tools/testing/selftests/mm/soft-dirty.c
>> index cc5f144430d4..8a2cd161ec4d 100644
>> --- a/tools/testing/selftests/mm/soft-dirty.c
>> +++ b/tools/testing/selftests/mm/soft-dirty.c
> 
> Hi Ryan,
> 
> Probably very similar to what David is requesting: given that arm64
> definitively does not support soft dirty, I'd suggest that we not even
> *build* the soft dirty tests on arm64!
> 
> There is no need to worry about counting, skipping or waiving such
> tests, either. Because it's just a non-issue: one does not care about
> test status for something that is documented as "this feature is simply
> unavailable here".

OK fair enough. I'll follow this approach for v2.

Thanks for the review!

> 
> 
> thanks,
Ryan Roberts July 17, 2023, 8:36 a.m. UTC | #3
On 13/07/2023 15:16, Mark Brown wrote:
> On Thu, Jul 13, 2023 at 02:54:32PM +0100, Ryan Roberts wrote:
>> The selftests runner pipes the test program's stdout to tap_prefix. The
>> presence of the pipe means that the test program sets its stdout to be
>> fully buffered (as opposed to line buffered when directly connected to
>> the terminal). The block buffering means that there is often content in
>> the buffer at fork() time, which causes the output to end up duplicated.
>> This was causing problems for mm:cow where test results were duplicated
>> 20-30x.
>>
>> Solve this by using `stdbuf`, when available, to force the test program
>> to use line buffered mode. This means previously printf'ed results are
>> flushed out of the program before any fork().
> 
> This is going to be useful in general since not all selftests use the
> kselftest helpers but it'd probably also be good to make
> ksft_print_header() also make the output unbuffered so that if stdbuf
> isn't installed on the target system or the tests are run standalone we
> don't run into issues there.  Even if the test isn't corrupting data
> having things unbuffered is going to be good for making sure we don't
> drop any output if the test dies.
> 
>> +		if [ -x /usr/bin/stdbuf ]; then
>> +			stdbuf="/usr/bin/stdbuf --output=L "
>> +		fi
> 
> Might be more robust to use type -p to find stdbuf in case it's in /bin
> or something?

Just looking at making this change; run_kselftest.sh's shebang is for sh, and
sh's type doesn't support the -p option. So I'm inclined to leave it as is.
There are multiple other places in the script where /usr/bin is hardcoded when
looking for programs too. Shout if you violently disagree.
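
For reference, POSIX sh does provide `command -v`, which prints the path of a
program found in $PATH and so could stand in for `type -p`. The sketch below
shows how the lookup might be written that way; it is not what the patch as
posted does (that keeps the hardcoded /usr/bin path), and the function name is
made up for illustration.

```shell
#!/bin/sh
# Sketch only: locate stdbuf anywhere in $PATH using POSIX `command -v`.
# Prints the command prefix (with trailing space) when stdbuf is found,
# or nothing at all when it is not installed.
find_stdbuf_prefix() {
	if path="$(command -v stdbuf 2>/dev/null)" && [ -x "$path" ]; then
		printf '%s' "$path --output=L "
	fi
}

stdbuf="$(find_stdbuf_prefix)"
```

The resulting `$stdbuf` could then be prepended to the test command in the same
way as in the quoted hunk, falling back to an empty prefix when stdbuf is
missing.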