From patchwork Tue Jun 13 10:28:40 2017
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 104377
From: Will Deacon
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com, mgorman@suse.de,
    steve.capper@arm.com, vbabka@suse.cz, Will Deacon
Subject: [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages
Date: Tue, 13 Jun 2017 11:28:40 +0100
Message-Id: <1497349722-6731-2-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1497349722-6731-1-git-send-email-will.deacon@arm.com>
References: <1497349722-6731-1-git-send-email-will.deacon@arm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Mark Rutland

In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():

    // do_huge_pmd_numa_page            // migrate_misplaced_transhuge_page()
    // Holds 0 refs on page             // Holds 2 refs on page

    vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
    /* ... */
    if (pmd_trans_migrating(*vmf->pmd)) {
        page = pmd_page(*vmf->pmd);
        spin_unlock(vmf->ptl);
                                        ptl = pmd_lock(mm, pmd);
                                        if (page_count(page) != 2) {
                                            /* roll back */
                                        }
                                        /* ... */
                                        mlock_migrate_page(new_page, page);
                                        /* ... */
                                        spin_unlock(ptl);
                                        put_page(page);
                                        put_page(page); // page freed here
        wait_on_page_locked(page);
        goto out;
    }

This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.

We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().

When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.

Acked-by: Steve Capper
Acked-by: Kirill A. Shutemov
Acked-by: Vlastimil Babka
Fixes: b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration")
Signed-off-by: Mark Rutland
Signed-off-by: Will Deacon
---
 mm/huge_memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

-- 
2.1.4

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a84909cf20d3..88c6167f194d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1426,8 +1426,11 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	 */
 	if (unlikely(pmd_trans_migrating(*vmf->pmd))) {
 		page = pmd_page(*vmf->pmd);
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
+		put_page(page);
 		goto out;
 	}
 
@@ -1459,9 +1462,12 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 
 	/* Migration could have started since the pmd_trans_migrating check */
 	if (!page_locked) {
+		page_nid = -1;
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
-		page_nid = -1;
+		put_page(page);
 		goto out;
 	}
From patchwork Tue Jun 13 10:28:41 2017
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 104379
From: Will Deacon
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com, mgorman@suse.de,
    steve.capper@arm.com, vbabka@suse.cz, Will Deacon
Subject: [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses
Date: Tue, 13 Jun 2017 11:28:41 +0100
Message-Id: <1497349722-6731-3-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1497349722-6731-1-git-send-email-will.deacon@arm.com>
References: <1497349722-6731-1-git-send-email-will.deacon@arm.com>

page_ref_freeze and page_ref_unfreeze are designed to be used as a pair,
wrapping a critical section where struct pages can be modified without
having to worry about consistency for a concurrent fast-GUP.

Whilst page_ref_freeze has full barrier semantics due to its use of
atomic_cmpxchg, page_ref_unfreeze is implemented using atomic_set, which
doesn't provide any barrier semantics and allows the operation to be
reordered with respect to page modifications in the critical section.

This patch ensures that page_ref_unfreeze is ordered after any critical
section updates, by invoking smp_mb() prior to the atomic_set.

Cc: "Kirill A. Shutemov"
Acked-by: Steve Capper
Acked-by: Kirill A. Shutemov
Signed-off-by: Will Deacon
---
 include/linux/page_ref.h | 1 +
 1 file changed, 1 insertion(+)

-- 
2.1.4

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 610e13271918..1fd71733aa68 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -174,6 +174,7 @@ static inline void page_ref_unfreeze(struct page *page, int count)
 	VM_BUG_ON_PAGE(page_count(page) != 0, page);
 	VM_BUG_ON(count == 0);
 
+	smp_mb();
 	atomic_set(&page->_refcount, count);
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_unfreeze))
 		__page_ref_unfreeze(page, count);
From patchwork Tue Jun 13 10:28:42 2017
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 104380
From: Will Deacon
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com, mgorman@suse.de,
    steve.capper@arm.com, vbabka@suse.cz, Will Deacon
Subject: [PATCH v2 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages
Date: Tue, 13 Jun 2017 11:28:42 +0100
Message-Id: <1497349722-6731-4-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1497349722-6731-1-git-send-email-will.deacon@arm.com>
References: <1497349722-6731-1-git-send-email-will.deacon@arm.com>

When migrating a transparent hugepage, migrate_misplaced_transhuge_page
guards itself against a concurrent fast-GUP of the page by checking that
the page count is equal to 2 before and after installing the new pmd.

If the page count changes, then the pmd is reverted back to the original
entry; however, there is a small window where the new (possibly
writable) pmd is installed and the underlying page could be written by
userspace. Restoring the old pmd could therefore result in loss of data.

This patch fixes the problem by freezing the page count whilst updating
the page tables, which protects against a concurrent fast-GUP without
the need to restore the old pmd in the failure case (since the page
count can no longer change under our feet).

Cc: Mel Gorman
Acked-by: Kirill A. Shutemov
Signed-off-by: Will Deacon
---
 mm/migrate.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

-- 
2.1.4

diff --git a/mm/migrate.c b/mm/migrate.c
index 89a0a1707f4c..8b21f1b1ec6e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1913,7 +1913,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	int page_lru = page_is_file_cache(page);
 	unsigned long mmun_start = address & HPAGE_PMD_MASK;
 	unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
-	pmd_t orig_entry;
 
 	/*
 	 * Rate-limit the amount of data that is being migrated to a node.
@@ -1956,8 +1955,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	/* Recheck the target PMD */
 	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
 	ptl = pmd_lock(mm, pmd);
-	if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
-fail_putback:
+	if (unlikely(!pmd_same(*pmd, entry) || !page_ref_freeze(page, 2))) {
 		spin_unlock(ptl);
 		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
@@ -1979,7 +1977,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		goto out_unlock;
 	}
 
-	orig_entry = *pmd;
 	entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 	entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 
@@ -1996,15 +1993,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	set_pmd_at(mm, mmun_start, pmd, entry);
 	update_mmu_cache_pmd(vma, address, &entry);
 
-	if (page_count(page) != 2) {
-		set_pmd_at(mm, mmun_start, pmd, orig_entry);
-		flush_pmd_tlb_range(vma, mmun_start, mmun_end);
-		mmu_notifier_invalidate_range(mm, mmun_start, mmun_end);
-		update_mmu_cache_pmd(vma, address, &entry);
-		page_remove_rmap(new_page, true);
-		goto fail_putback;
-	}
-
+	page_ref_unfreeze(page, 2);
 	mlock_migrate_page(new_page, page);
 	page_remove_rmap(page, true);
 	set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);