From patchwork Fri Mar 14 18:33:33 2014
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 26285
From: John Stultz <john.stultz@linaro.org>
To: LKML
Cc: John Stultz, Andrew Morton, Android Kernel Team, Johannes Weiner,
 Robert Love, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
 Dmitry Adamushko, Neil Brown, Andrea Arcangeli, Mike Hommey, Taras Glek,
 Dhaval Giani, Jan Kara, KOSAKI Motohiro, Michel Lespinasse, Minchan Kim,
 "linux-mm@kvack.org"
Subject: [PATCH 3/3] vrange: Add page purging logic & SIGBUS trap
Date: Fri, 14 Mar 2014 11:33:33 -0700
Message-Id: <1394822013-23804-4-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.8.3.2
In-Reply-To: <1394822013-23804-1-git-send-email-john.stultz@linaro.org>
References: <1394822013-23804-1-git-send-email-john.stultz@linaro.org>

Finally, this patch adds the hooks into the vmscan logic needed to discard
volatile pages and mark their ptes as purged. With this, volatile pages
will be purged under memory pressure and their ptes replaced with a special
"purged" swap entry. If a purged page is accessed before the range is
marked non-volatile, we catch the fault and send a SIGBUS.

This is a simplified implementation that uses logic from Minchan's earlier
efforts, so credit to Minchan for his work.
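To make the intended usage concrete, a userspace consumer of this interface
would look roughly like the sketch below. This is illustrative only and not
part of the patch: it assumes the four-argument vrange(start, len, mode,
&purged) call added earlier in this series, and the syscall number shown is
just a placeholder, since the real value depends on the per-arch wiring.

#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define VRANGE_NONVOLATILE      0
#define VRANGE_VOLATILE         1
#define __NR_vrange             314     /* placeholder syscall number */

static void fill_cache(char *buf, size_t len)
{
        memset(buf, 0xAB, len);         /* stand-in for regenerating cached data */
}

int main(void)
{
        size_t len = 16 * 4096;
        int purged = 0;
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (buf == MAP_FAILED)
                return 1;

        fill_cache(buf, len);

        /* Cache is idle for now: let the kernel reclaim it if it must. */
        syscall(__NR_vrange, (unsigned long)buf, len, VRANGE_VOLATILE, &purged);

        /* ... some time later, before touching the cache again ... */
        syscall(__NR_vrange, (unsigned long)buf, len, VRANGE_NONVOLATILE, &purged);
        if (purged)
                fill_cache(buf, len);   /* pages were discarded under pressure */

        munmap(buf, len);
        return 0;
}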
Cc: Andrew Morton
Cc: Android Kernel Team
Cc: Johannes Weiner
Cc: Robert Love
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Dmitry Adamushko
Cc: Neil Brown
Cc: Andrea Arcangeli
Cc: Mike Hommey
Cc: Taras Glek
Cc: Dhaval Giani
Cc: Jan Kara
Cc: KOSAKI Motohiro
Cc: Michel Lespinasse
Cc: Minchan Kim
Cc: linux-mm@kvack.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/vrange.h |  2 ++
 mm/internal.h          |  2 --
 mm/memory.c            | 21 +++++++++++
 mm/rmap.c              |  5 +++
 mm/vmscan.c            | 12 +++++++
 mm/vrange.c            | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 137 insertions(+), 2 deletions(-)

diff --git a/include/linux/vrange.h b/include/linux/vrange.h
index c4a1616..b18551f 100644
--- a/include/linux/vrange.h
+++ b/include/linux/vrange.h
@@ -7,6 +7,8 @@
 #define VRANGE_NONVOLATILE 0
 #define VRANGE_VOLATILE 1
 
+extern int discard_vpage(struct page *page);
+
 static inline swp_entry_t swp_entry_mk_vrange_purged(void)
 {
 	return swp_entry(SWP_VRANGE_PURGED, 0);
diff --git a/mm/internal.h b/mm/internal.h
index 29e1e76..ea66bf9 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -225,10 +225,8 @@ static inline void mlock_migrate_page(struct page *newpage, struct page *page)
 
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern unsigned long vma_address(struct page *page,
 				 struct vm_area_struct *vma);
-#endif
 #else /* !CONFIG_MMU */
 static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page *p)
 {
diff --git a/mm/memory.c b/mm/memory.c
index 22dfa61..7ea9712 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -60,6 +60,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -3643,6 +3644,8 @@ static int handle_pte_fault(struct mm_struct *mm,
 
 	entry = *pte;
 	if (!pte_present(entry)) {
+		swp_entry_t vrange_entry;
+retry:
 		if (pte_none(entry)) {
 			if (vma->vm_ops) {
 				if (likely(vma->vm_ops->fault))
@@ -3652,6 +3655,24 @@ static int handle_pte_fault(struct mm_struct *mm,
 			return do_anonymous_page(mm, vma, address,
 						 pte, pmd, flags);
 		}
+
+		vrange_entry = pte_to_swp_entry(entry);
+		if (unlikely(entry_is_vrange_purged(vrange_entry))) {
+			if (vma->vm_flags & VM_VOLATILE)
+				return VM_FAULT_SIGBUS;
+
+			/* zap pte */
+			ptl = pte_lockptr(mm, pmd);
+			spin_lock(ptl);
+			if (unlikely(!pte_same(*pte, entry)))
+				goto unlock;
+			flush_cache_page(vma, address, pte_pfn(*pte));
+			ptep_clear_flush(vma, address, pte);
+			pte_unmap_unlock(pte, ptl);
+			goto retry;
+		}
+
 		if (pte_file(entry))
 			return do_nonlinear_fault(mm, vma, address,
 					pte, pmd, flags, entry);
diff --git a/mm/rmap.c b/mm/rmap.c
index d9d4231..2b6f079 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -728,6 +728,11 @@ int page_referenced_one(struct page *page, struct vm_area_struct *vma,
 			referenced++;
 		}
 		pte_unmap_unlock(pte, ptl);
+		if (vma->vm_flags & VM_VOLATILE) {
+			pra->mapcount = 0;
+			pra->vm_flags |= VM_VOLATILE;
+			return SWAP_FAIL;
+		}
 	}
 
 	if (referenced) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9c74b4..c5c0ee0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -683,6 +684,7 @@ enum page_references {
 	PAGEREF_RECLAIM,
 	PAGEREF_RECLAIM_CLEAN,
 	PAGEREF_KEEP,
+	PAGEREF_DISCARD,
 	PAGEREF_ACTIVATE,
 };
 
@@ -703,6 +705,13 @@ static enum page_references page_check_references(struct page *page,
 	if (vm_flags & VM_LOCKED)
 		return PAGEREF_RECLAIM;
 
+	/*
+	 * If volatile page is reached on LRU's tail, we discard the
+	 * page without considering recycle the page.
+	 */
+	if (vm_flags & VM_VOLATILE)
+		return PAGEREF_DISCARD;
+
 	if (referenced_ptes) {
 		if (PageSwapBacked(page))
 			return PAGEREF_ACTIVATE;
@@ -930,6 +939,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		switch (references) {
 		case PAGEREF_ACTIVATE:
 			goto activate_locked;
+		case PAGEREF_DISCARD:
+			if (may_enter_fs && discard_vpage(page) == 0)
+				goto free_it;
 		case PAGEREF_KEEP:
 			goto keep_locked;
 		case PAGEREF_RECLAIM:
diff --git a/mm/vrange.c b/mm/vrange.c
index 844571b..fc9906f 100644
--- a/mm/vrange.c
+++ b/mm/vrange.c
@@ -205,3 +205,100 @@ SYSCALL_DEFINE4(vrange, unsigned long, start,
 out:
 	return ret;
 }
+
+static void try_to_discard_one(struct page *page, struct vm_area_struct *vma)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	pte_t *pte;
+	pte_t pteval;
+	spinlock_t *ptl;
+	unsigned long addr;
+
+	VM_BUG_ON(!PageLocked(page));
+
+	addr = vma_address(page, vma);
+	pte = page_check_address(page, mm, addr, &ptl, 0);
+	if (!pte)
+		return;
+
+	BUG_ON(vma->vm_flags & (VM_SPECIAL|VM_LOCKED|VM_MIXEDMAP|VM_HUGETLB));
+
+	flush_cache_page(vma, addr, page_to_pfn(page));
+	pteval = ptep_clear_flush(vma, addr, pte);
+
+	update_hiwater_rss(mm);
+	if (PageAnon(page))
+		dec_mm_counter(mm, MM_ANONPAGES);
+	else
+		dec_mm_counter(mm, MM_FILEPAGES);
+
+	page_remove_rmap(page);
+	page_cache_release(page);
+
+	set_pte_at(mm, addr, pte,
+		   swp_entry_to_pte(swp_entry_mk_vrange_purged()));
+
+	pte_unmap_unlock(pte, ptl);
+	mmu_notifier_invalidate_page(mm, addr);
+
+}
+
+
+static int try_to_discard_anon_vpage(struct page *page)
+{
+	struct anon_vma *anon_vma;
+	struct anon_vma_chain *avc;
+	pgoff_t pgoff;
+
+	anon_vma = page_lock_anon_vma_read(page);
+	if (!anon_vma)
+		return -1;
+
+	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	/*
+	 * During interating the loop, some processes could see a page as
+	 * purged while others could see a page as not-purged because we have
+	 * no global lock between parent and child for protecting vrange system
+	 * call during this loop. But it's not a problem because the page is
+	 * not *SHARED* page but *COW* page so parent and child can see other
+	 * data anytime. The worst case by this race is a page was purged
+	 * but couldn't be discarded so it makes unnecessary page fault but
+	 * it wouldn't be severe.
+	 */
+	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
+		struct vm_area_struct *vma = avc->vma;
+
+		if (!(vma->vm_flags & VM_VOLATILE))
+			continue;
+		try_to_discard_one(page, vma);
+	}
+	page_unlock_anon_vma_read(anon_vma);
+	return 0;
+}
+
+
+static int try_to_discard_vpage(struct page *page)
+{
+	if (PageAnon(page))
+		return try_to_discard_anon_vpage(page);
+	return -1;
+}
+
+
+int discard_vpage(struct page *page)
+{
+	VM_BUG_ON(!PageLocked(page));
+	VM_BUG_ON(PageLRU(page));
+
+	if (!try_to_discard_vpage(page)) {
+		if (PageSwapCache(page))
+			try_to_free_swap(page);
+
+		if (page_freeze_refs(page, 1)) {
+			unlock_page(page);
+			return 0;
+		}
+	}
+
+	return 1;
+}
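On the SIGBUS side, the handle_pte_fault() hunk above returns
VM_FAULT_SIGBUS when a purged page is touched while its range is still
marked volatile. A process that reads volatile pages in place, rather than
marking them non-volatile first and checking the purged flag, would need
roughly the handler pattern sketched below. This is illustrative only, not
part of the patch, and whether the fault actually fires depends on the
pages having been purged under memory pressure.

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf purged_env;

static void sigbus_handler(int sig)
{
        (void)sig;
        siglongjmp(purged_env, 1);      /* jump back out of the faulting access */
}

/* Returns 0 and stores a byte on success, -1 if the page had been purged. */
static int read_volatile_byte(const volatile char *p, char *out)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sigbus_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGBUS, &sa, NULL);

        if (sigsetjmp(purged_env, 1))
                return -1;              /* SIGBUS: rebuild the data instead */

        *out = *p;
        return 0;
}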