mm/thp: fix __split_huge_pmd_locked() for migration PMD

Message ID: 20200921182748.2618107-1-zi.yan@sent.com
State: Superseded
Series: mm/thp: fix __split_huge_pmd_locked() for migration PMD

Commit Message

Zi Yan Sept. 21, 2020, 6:27 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

For 4.19.

[ Upstream commit ec0abae6dcdf7ef88607c869bf35a4b63ce1b370 ]

From: Ralph Campbell <rcampbell@nvidia.com>
Date: Fri, 18 Sep 2020 21:20:24 -0700
Subject: [PATCH] mm/thp: fix __split_huge_pmd_locked() for migration PMD

A migrating transparent huge page has to already be unmapped.  Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost.  The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked() leading one to think that
__split_huge_pmd_locked() can handle splitting a migrating PMD.
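
For context, a simplified sketch of the dispatch in __split_huge_pmd()
(a reduced form for illustration only; the real function also handles
mmu_notifier ranges and mlocked pages, and the sketch's function name
is made up):

	static void split_huge_pmd_sketch(struct vm_area_struct *vma,
					  pmd_t *pmd, unsigned long address,
					  bool freeze)
	{
		unsigned long haddr = address & HPAGE_PMD_MASK;
		spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);

		/*
		 * Migration PMDs reach the locked helper too, so
		 * __split_huge_pmd_locked() must handle a non-present
		 * (migration) entry itself.
		 */
		if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
		    is_pmd_migration_entry(*pmd))
			__split_huge_pmd_locked(vma, pmd, haddr, freeze);
		spin_unlock(ptl);
	}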

However, the code always increments the page->_mapcount and adjusts the
memory control group accounting assuming the page is mapped.

Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)).  Fix these problems by
checking for a PMD migration entry.
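
A minimal sketch of the two checks the fix relies on, using the
existing helpers from <linux/swapops.h> (this mirrors the diff below;
the wrapper function name is made up for illustration):

	static unsigned long thp_pmd_pfn_sketch(pmd_t pmd)
	{
		/*
		 * A migration entry is not a present PMD: its PFN is
		 * encoded in the swap entry, so pmd_pfn() would decode
		 * the wrong bits.
		 */
		if (is_pmd_migration_entry(pmd))
			return migration_entry_to_pfn(pmd_to_swp_entry(pmd));
		return pmd_pfn(pmd);
	}

and, in the PTE-filling loop, the per-subpage _mapcount is only taken
for a page that is actually mapped:

	if (!pmd_migration)
		atomic_inc(&page[i]._mapcount);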

Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>	[4.14+]
Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/huge_memory.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

Comments

Zi Yan Sept. 21, 2020, 6:41 p.m. UTC | #1
On 21 Sep 2020, at 14:27, Zi Yan wrote:

> From: Zi Yan <ziy@nvidia.com>
>
> For 4.19.

It applies to v5.4.y too.

> [snip]


--
Best Regards,
Yan Zi

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1443ae6fee9b..811fb2477ecd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2145,7 +2145,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		put_page(page);
 		add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR);
 		return;
-	} else if (is_huge_zero_pmd(*pmd)) {
+	} else if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
 		/*
 		 * FIXME: Do we want to invalidate secondary mmu by calling
 		 * mmu_notifier_invalidate_range() see comments below inside
@@ -2233,26 +2233,29 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
-		atomic_inc(&page[i]._mapcount);
-		pte_unmap(pte);
-	}
-
-	/*
-	 * Set PG_double_map before dropping compound_mapcount to avoid
-	 * false-negative page_mapped().
-	 */
-	if (compound_mapcount(page) > 1 && !TestSetPageDoubleMap(page)) {
-		for (i = 0; i < HPAGE_PMD_NR; i++)
+		if (!pmd_migration)
 			atomic_inc(&page[i]._mapcount);
+		pte_unmap(pte);
 	}
 
-	if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
-		/* Last compound_mapcount is gone. */
-		__dec_node_page_state(page, NR_ANON_THPS);
-		if (TestClearPageDoubleMap(page)) {
-			/* No need in mapcount reference anymore */
+	if (!pmd_migration) {
+		/*
+		 * Set PG_double_map before dropping compound_mapcount to avoid
+		 * false-negative page_mapped().
+		 */
+		if (compound_mapcount(page) > 1 && !TestSetPageDoubleMap(page)) {
 			for (i = 0; i < HPAGE_PMD_NR; i++)
-				atomic_dec(&page[i]._mapcount);
+				atomic_inc(&page[i]._mapcount);
+		}
+
+		if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
+			/* Last compound_mapcount is gone. */
+			__dec_node_page_state(page, NR_ANON_THPS);
+			if (TestClearPageDoubleMap(page)) {
+				/* No need in mapcount reference anymore */
+				for (i = 0; i < HPAGE_PMD_NR; i++)
+					atomic_dec(&page[i]._mapcount);
+			}
 		}
 	}