Message ID | 20240624163348.1751454-1-jiaqiyan@google.com |
---|---|
Series | Userspace controls soft-offline pages |
On 2024/6/25 0:33, Jiaqi Yan wrote:
> Add regression and new tests when hugepage has correctable memory
...
> diff --git a/tools/testing/selftests/mm/hugetlb-soft-offline.c b/tools/testing/selftests/mm/hugetlb-soft-offline.c
> new file mode 100644
> index 000000000000..16fe52f972e2
> --- /dev/null
> +++ b/tools/testing/selftests/mm/hugetlb-soft-offline.c
> @@ -0,0 +1,227 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Test soft offline behavior for HugeTLB pages:
> + * - if enable_soft_offline = 0, hugepages should stay intact and soft
> + *   offlining failed with EINVAL.

s/failed with EINVAL/failed with EOPNOTSUPP/g

> + * - if enable_soft_offline = 1, a hugepage should be dissolved and
> + *   nr_hugepages/free_hugepages should be reduced by 1.
> + *
> + * Before running, make sure more than 2 hugepages of default_hugepagesz
> + * are allocated. For example, if /proc/meminfo/Hugepagesize is 2048kB:
> + *   echo 8 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> + */
> +
...
> +static void test_soft_offline_common(int enable_soft_offline)
> +{
> +	int fd;
> +	int expect_errno = enable_soft_offline ? 0 : EOPNOTSUPP;
> +	struct statfs file_stat;
> +	unsigned long hugepagesize_kb = 0;
> +	unsigned long nr_hugepages_before = 0;
> +	unsigned long nr_hugepages_after = 0;
> +	int ret;
> +
> +	ksft_print_msg("Test soft-offline when enabled_soft_offline=%d\n",
> +		       enable_soft_offline);
> +
> +	fd = create_hugetlbfs_file(&file_stat);
> +	if (fd < 0) {
> +		ksft_exit_fail_msg("Failed to create hugetlbfs file\n");
> +		return;
> +	}
> +
> +	hugepagesize_kb = file_stat.f_bsize / 1024;
> +	ksft_print_msg("Hugepagesize is %ldkB\n", hugepagesize_kb);
> +
> +	if (set_enable_soft_offline(enable_soft_offline)) {
> +		ksft_exit_fail_msg("Failed to set enable_soft_offline\n");

Call destroy_hugetlbfs_file() in error path?

> +		return;
> +	}
> +
> +	if (read_nr_hugepages(hugepagesize_kb, &nr_hugepages_before) != 0) {
> +		ksft_exit_fail_msg("Failed to read nr_hugepages\n");
> +		return;
> +	}
> +
> +	ksft_print_msg("Before MADV_SOFT_OFFLINE nr_hugepages=%ld\n",
> +		       nr_hugepages_before);
> +
> +	ret = do_soft_offline(fd, 2 * file_stat.f_bsize, expect_errno);
> +
> +	if (read_nr_hugepages(hugepagesize_kb, &nr_hugepages_after) != 0) {
> +		ksft_exit_fail_msg("Failed to read nr_hugepages\n");
> +		return;
> +	}
> +
> +	ksft_print_msg("After MADV_SOFT_OFFLINE nr_hugepages=%ld\n",
> +		       nr_hugepages_after);
> +
> +	if (enable_soft_offline) {
> +		if (nr_hugepages_before != nr_hugepages_after + 1) {
> +			ksft_test_result_fail("MADV_SOFT_OFFLINE should reduced 1 hugepage\n");
> +			return;
> +		}
> +	} else {
> +		if (nr_hugepages_before != nr_hugepages_after) {
> +			ksft_test_result_fail("MADV_SOFT_OFFLINE reduced %lu hugepages\n",
> +					      nr_hugepages_before - nr_hugepages_after);
> +			return;
> +		}
> +	}
> +
> +	ksft_test_result(ret == 0,
> +			 "Test soft-offline when enabled_soft_offline=%d\n",
> +			 enable_soft_offline);

Call destroy_hugetlbfs_file() when test finished ?

Thanks.
.
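
The cleanup requested in the two review comments is often structured around a single exit label, so that every failure path and the success path release the file in one place. The following is only a sketch of that pattern, not the patch author's code: `fake_create_file()`, `fake_destroy_file()`, `run_test_with_cleanup()`, `self_check()` and the `fail_step` parameter are hypothetical stand-ins, because the real signatures of create_hugetlbfs_file() and destroy_hugetlbfs_file() are not shown in this thread.

```c
#include <assert.h>

/* Hypothetical bookkeeping: number of files currently held open. */
static int open_count;

/* Hypothetical stand-in for create_hugetlbfs_file(); returns a fake fd. */
static int fake_create_file(void)
{
	open_count++;
	return 3;
}

/* Hypothetical stand-in for destroy_hugetlbfs_file(). */
static void fake_destroy_file(int fd)
{
	(void)fd;
	open_count--;
}

/*
 * Returns 0 on success and -1 on failure. fail_step simulates which
 * intermediate step fails (0 means nothing fails). Every path taken
 * after the file is created goes through the "out" label.
 */
static int run_test_with_cleanup(int fail_step)
{
	int ret = -1;
	int fd = fake_create_file();

	if (fd < 0)
		return -1;	/* nothing to clean up yet */

	if (fail_step == 1)	/* e.g. setting enable_soft_offline failed */
		goto out;

	if (fail_step == 2)	/* e.g. reading nr_hugepages failed */
		goto out;

	ret = 0;
out:
	fake_destroy_file(fd);
	return ret;
}

/* Usage: no file stays open, whichever step fails. */
static void self_check(void)
{
	int step;

	for (step = 0; step <= 2; step++) {
		run_test_with_cleanup(step);
		assert(open_count == 0);
	}
}
```

With one exit label, adding a new failure check later does not require remembering to repeat the cleanup call.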
On Mon, Jun 24, 2024 at 11:41 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2024/6/25 0:33, Jiaqi Yan wrote:
> > Logs from soft_offline_page and soft_offline_in_use_page have
> > different formats than majority of the memory failure code:
> >
> >   "Memory failure: 0x${pfn}: ${lower_case_message}"
> >
> > Convert them to the following format:
> >
> >   "Soft offline: 0x${pfn}: ${lower_case_message}"
> >
> > No functional change in this commit.
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> > ---
> >  mm/memory-failure.c | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index d3c830e817e3..2a097af7da0e 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -2631,6 +2631,9 @@ int unpoison_memory(unsigned long pfn)
> >  }
> >  EXPORT_SYMBOL(unpoison_memory);
> >
> > +#undef pr_fmt
> > +#define pr_fmt(fmt) "Soft offline: " fmt
> > +
> >  static bool mf_isolate_folio(struct folio *folio, struct list_head *pagelist)
> >  {
> >  	bool isolated = false;
> > @@ -2686,7 +2689,7 @@ static int soft_offline_in_use_page(struct page *page)
> >
> >  	if (!huge && folio_test_large(folio)) {
> >  		if (try_to_split_thp_page(page)) {
> > -			pr_info("soft offline: %#lx: thp split failed\n", pfn);
> > +			pr_info("%#lx: thp split failed\n", pfn);
> >  			return -EBUSY;
> >  		}
> >  		folio = page_folio(page);
> > @@ -2698,7 +2701,7 @@ static int soft_offline_in_use_page(struct page *page)
> >  	if (PageHWPoison(page)) {
> >  		folio_unlock(folio);
> >  		folio_put(folio);
> > -		pr_info("soft offline: %#lx page already poisoned\n", pfn);
> > +		pr_info("%#lx page already poisoned\n", pfn);
>
> Again, it's better to be "%#lx: page" to make log format consistent.

Ah, I missed a ":", thanks for catching this!

> Thanks.
> .
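
For readers unfamiliar with the mechanism: in the kernel, pr_info() passes its format string through pr_fmt(), so redefining pr_fmt partway through a file prefixes every later message by string-literal concatenation. The userspace sketch below imitates that behaviour and shows the message with the missing ":" added. It is an illustration under assumptions, not kernel code: `pr_info_to_buf()`, `format_already_poisoned()` and `self_check()` are hypothetical names, and snprintf() stands in for printk().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same definition as in the quoted hunk: adjacent string literals are joined. */
#define pr_fmt(fmt) "Soft offline: " fmt

/*
 * Hypothetical userspace stand-in for pr_info(): formats into a buffer
 * instead of logging. ##__VA_ARGS__ is a GNU extension (gcc/clang), as
 * used by the kernel's own printk macros.
 */
#define pr_info_to_buf(buf, size, fmt, ...) \
	snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* Builds the "already poisoned" message in the "%#lx: page" form from the review. */
static int format_already_poisoned(char *buf, size_t size, unsigned long pfn)
{
	return pr_info_to_buf(buf, size, "%#lx: page already poisoned\n", pfn);
}

/* Usage: the prefix and the pfn separator both appear in the result. */
static void self_check(void)
{
	char buf[128];

	format_already_poisoned(buf, sizeof(buf), 0x1234UL);
	assert(strcmp(buf, "Soft offline: 0x1234: page already poisoned\n") == 0);
}
```

Because the prefix lives in one macro, each call site only carries the "0x${pfn}: ${lower_case_message}" part, which is why the per-message "soft offline:" text is dropped in the patch.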