From patchwork Thu Jul 26 23:55:48 2012
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 10287
From: John Stultz
To: Dave Hansen
Cc: John Stultz
Subject: [PATCH 4/5] [RFC][HACK] Add VOLATILE_LRU support to the VM
Date: Thu, 26 Jul 2012 19:55:48 -0400
Message-Id: <1343346949-53715-5-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1343346949-53715-1-git-send-email-john.stultz@linaro.org>
References: <1343346949-53715-1-git-send-email-john.stultz@linaro.org>

---
 include/linux/fs.h         |    1 +
 include/linux/mm_inline.h  |    2 +
 include/linux/mmzone.h     |    1 +
 include/linux/page-flags.h |    3 +
 include/linux/swap.h       |    3 +
 mm/memcontrol.c            |    1 +
 mm/page_alloc.c            |    1 +
 mm/swap.c                  |   64 +++++++++++++
 mm/vmscan.c                |  215 +++++++++++++++++++++++++++++++++++++++++++-
 9 files changed, 290 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8fabb03..c6f3415 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -636,6 +636,7 @@ struct address_space_operations {
 	int (*is_partially_uptodate) (struct page *, read_descriptor_t *,
 					unsigned long);
 	int (*error_remove_page)(struct address_space *, struct page *);
+	int (*purgepage)(struct page *page, struct writeback_control *wbc);
 };
 
 extern const struct address_space_operations empty_aops;
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 1397ccf..f78806c 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -91,6 +91,8 @@ static __always_inline enum lru_list page_lru(struct page *page)
 
 	if (PageUnevictable(page))
 		lru = LRU_UNEVICTABLE;
+	else if (PageIsVolatile(page))
+		lru = LRU_VOLATILE;
 	else {
 		lru = page_lru_base_type(page);
 		if (PageActive(page))
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 68c569f..96f08bb 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -162,6 +162,7 @@ enum lru_list {
 	LRU_ACTIVE_ANON = LRU_BASE + LRU_ACTIVE,
 	LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
 	LRU_ACTIVE_FILE = LRU_BASE + LRU_FILE + LRU_ACTIVE,
+	LRU_VOLATILE,
 	LRU_UNEVICTABLE,
 	NR_LRU_LISTS
 };
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index c88d2a9..57800c8 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -108,6 +108,7 @@ enum pageflags {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	PG_compound_lock,
 #endif
+	PG_isvolatile,
 	__NR_PAGEFLAGS,
 
 	/* Filesystems */
@@ -201,6 +202,8 @@ PAGEFLAG(Dirty, dirty) TESTSCFLAG(Dirty, dirty) __CLEARPAGEFLAG(Dirty, dirty)
 PAGEFLAG(LRU, lru) __CLEARPAGEFLAG(LRU, lru)
 PAGEFLAG(Active, active) __CLEARPAGEFLAG(Active, active)
 	TESTCLEARFLAG(Active, active)
+PAGEFLAG(IsVolatile, isvolatile) __CLEARPAGEFLAG(IsVolatile, isvolatile)
+	TESTCLEARFLAG(IsVolatile, isvolatile)
 __PAGEFLAG(Slab, slab)
 PAGEFLAG(Checked, checked)		/* Used by some filesystems */
 PAGEFLAG(Pinned, pinned) TESTSCFLAG(Pinned, pinned)	/* Xen */
diff --git a/include/linux/swap.h b/include/linux/swap.h
index c84ec68..eb12d53 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -236,6 +236,9 @@ extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void swap_setup(void);
 
+extern void mark_volatile_page(struct page *page);
+extern void mark_nonvolatile_page(struct page *page);
+
 extern void add_page_to_unevictable_list(struct page *page);
 
 /**
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f72b5e5..98e1303 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4066,6 +4066,7 @@ static const char * const mem_cgroup_lru_names[] = {
 	"active_anon",
 	"inactive_file",
 	"active_file",
+	"volatile",
 	"unevictable",
 };
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4a4f921..cffe1b6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5975,6 +5975,7 @@ static const struct trace_print_flags pageflag_names[] = {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	{1UL << PG_compound_lock,	"compound_lock"	},
 #endif
+	{1UL << PG_isvolatile,		"volatile"	},
 };
 
 static void dump_page_flags(unsigned long flags)
diff --git a/mm/swap.c b/mm/swap.c
index 4e7e2ec..9491a9c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -574,6 +574,70 @@ void deactivate_page(struct page *page)
 	}
 }
+
+
+
+
+
+
+
+void mark_volatile_page(struct page *page)
+{
+	int lru;
+	bool active;
+	struct zone *zone = page_zone(page);
+	struct lruvec *lruvec;
+
+	if (!PageLRU(page))
+		return;
+
+	if (PageUnevictable(page))
+		return;
+
+	active = PageActive(page);
+	lru = page_lru_base_type(page);
+
+	spin_lock_irq(&zone->lru_lock);
+	lruvec = mem_cgroup_page_lruvec(page, zone);
+	del_page_from_lru_list(page, lruvec, lru + active);
+	add_page_to_lru_list(page, lruvec, LRU_VOLATILE);
+	SetPageIsVolatile(page);
+	ClearPageActive(page);
+	spin_unlock_irq(&zone->lru_lock);
+
+
+}
+
+
+void mark_nonvolatile_page(struct page *page)
+{
+	int lru;
+	struct zone *zone = page_zone(page);
+	struct lruvec *lruvec;
+
+	if (!PageLRU(page))
+		return;
+
+	if (!PageIsVolatile(page))
+		return;
+
+	lru = page_lru_base_type(page);
+
+	spin_lock_irq(&zone->lru_lock);
+	lruvec = mem_cgroup_page_lruvec(page, zone);
+	del_page_from_lru_list(page, lruvec, LRU_VOLATILE);
+	ClearPageIsVolatile(page);
+	SetPageActive(page);
+	add_page_to_lru_list(page, lruvec, lru + LRU_ACTIVE);
+	spin_unlock_irq(&zone->lru_lock);
+}
+
+
+
+
+
+
+
 
 void lru_add_drain(void)
 {
 	lru_add_drain_cpu(get_cpu());
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 66e4310..682f147 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -483,7 +483,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page)
 		if (!page_freeze_refs(page, 2))
 			goto cannot_free;
 		/* note: atomic_cmpxchg in page_freeze_refs provides the smp_rmb */
-		if (unlikely(PageDirty(page))) {
+		if (unlikely(PageDirty(page)) && !PageIsVolatile(page)) {
 			page_unfreeze_refs(page, 2);
 			goto cannot_free;
 		}
@@ -1190,6 +1190,212 @@ putback_inactive_pages(struct lruvec *lruvec, struct list_head *page_list)
 	list_splice(&pages_to_free, page_list);
 }
 
+
+
+/*
+ * shrink_page_list() returns the number of reclaimed pages
+ */
+static unsigned long shrink_volatile_page_list(struct list_head *page_list,
+					struct zone *zone,
+					struct scan_control *sc)
+{
+	LIST_HEAD(free_pages);
+	LIST_HEAD(ret_pages);
+	unsigned long nr_reclaimed = 0;
+	unsigned long nr_writeback = 0;
+	struct writeback_control wbc = {
+		.sync_mode = WB_SYNC_NONE,
+		.nr_to_write = SWAP_CLUSTER_MAX,
+		.range_start = 0,
+		.range_end = LLONG_MAX,
+		.for_reclaim = 1,
+	};
+
+
+	while (!list_empty(page_list)) {
+		enum page_references references;
+		struct address_space *mapping;
+		struct page *page;
+		int may_enter_fs;
+
+		cond_resched();
+
+		page = lru_to_page(page_list);
+		list_del(&page->lru);
+
+		if (!trylock_page(page))
+			goto keep;
+
+		VM_BUG_ON(PageActive(page));
+		VM_BUG_ON(page_zone(page) != zone);
+
+		if (unlikely(!page_evictable(page, NULL)))
+			goto keep_locked;
+
+		if (!sc->may_unmap && page_mapped(page))
+			goto keep_locked;
+
+
+		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
+			(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));
+
+		if (PageWriteback(page)) {
+			nr_writeback++;
+			unlock_page(page);
+			goto keep;
+		}
+
+		references = page_check_references(page, sc);
+		switch (references) {
+		case PAGEREF_ACTIVATE:
+		case PAGEREF_KEEP:
+			goto keep_locked;
+		case PAGEREF_RECLAIM:
+		case PAGEREF_RECLAIM_CLEAN:
+			; /* try to reclaim the page below */
+		}
+
+
+		mapping = page_mapping(page);
+
+
+		/*
+		 * The page is mapped into the page tables of one or more
+		 * processes. Try to unmap it here.
+		 */
+		if (page_mapped(page) && mapping) {
+			switch (try_to_unmap(page, TTU_UNMAP)) {
+			case SWAP_FAIL:
+			case SWAP_AGAIN:
+			case SWAP_MLOCK:
+				goto keep_locked;
+			case SWAP_SUCCESS:
+				; /* try to free the page below */
+			}
+		}
+
+		if (PageDirty(page)) {
+			/*
+			 * Only kswapd can writeback filesystem pages to
+			 * avoid risk of stack overflow but do not writeback
+			 * unless under significant pressure.
+			 */
+			if (page_is_file_cache(page) &&
+					(!current_is_kswapd() ||
+					 sc->priority >= DEF_PRIORITY - 2)) {
+				/*
+				 * Immediately reclaim when written back.
+				 * Similar in principle to deactivate_page()
+				 * except we already have the page isolated
+				 * and know it's dirty
+				 */
+				SetPageReclaim(page);
+
+				goto keep_locked;
+			}
+
+
+			if (!mapping) {
+				/*
+				 * Some data journaling orphaned pages can have
+				 * page->mapping == NULL while being dirty with clean buffers.
+				 */
+				if (page_has_private(page)) {
+					if (try_to_free_buffers(page)) {
+						ClearPageDirty(page);
+						printk("%s: orphaned page\n", __func__);
+					}
+				}
+
+			}
+		}
+
+		if (!mapping || !__remove_mapping(mapping, page))
+			goto keep_locked;
+
+
+
+		if (mapping && mapping->a_ops && mapping->a_ops->purgepage) {
+			mapping->a_ops->purgepage(page, &wbc);
+
+			/*
+			 * At this point, we have no other references and there is
+			 * no way to pick any more up (removed from LRU, removed
+			 * from pagecache). Can use non-atomic bitops now (and
+			 * we obviously don't have to worry about waking up a process
+			 * waiting on the page lock, because there are no references.
+			 */
+			__clear_page_locked(page);
+
+			unlock_page(page);
+
+			nr_reclaimed++;
+			/*
+			 * Is there need to periodically free_page_list? It would
+			 * appear not as the counts should be low
+			 */
+			VM_BUG_ON(PageActive(page));
+			list_add(&page->lru, &free_pages);
+			continue;
+		}
+
+
+keep_locked:
+		unlock_page(page);
+keep:
+		list_add(&page->lru, &ret_pages);
+		VM_BUG_ON(PageLRU(page) || PageUnevictable(page));
+	}
+
+
+	free_hot_cold_page_list(&free_pages, 1);
+
+	list_splice(&ret_pages, page_list);
+	return nr_reclaimed;
+}
+
+
+static noinline_for_stack unsigned long
+shrink_volatile_list(unsigned long nr_to_scan, struct lruvec *lruvec,
+		     struct scan_control *sc)
+{
+	LIST_HEAD(page_list);
+	unsigned long nr_scanned;
+	unsigned long nr_reclaimed = 0;
+	unsigned long nr_taken;
+	isolate_mode_t isolate_mode = 0;
+	struct zone *zone = lruvec_zone(lruvec);
+
+
+	lru_add_drain();
+
+	if (!sc->may_unmap)
+		isolate_mode |= ISOLATE_UNMAPPED;
+	if (!sc->may_writepage)
+		isolate_mode |= ISOLATE_CLEAN;
+
+	spin_lock_irq(&zone->lru_lock);
+	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list,
+				     &nr_scanned, sc, isolate_mode, LRU_VOLATILE);
+	spin_unlock_irq(&zone->lru_lock);
+
+	if (nr_taken == 0)
+		goto done;
+
+
+	nr_reclaimed = shrink_volatile_page_list(&page_list, zone, sc);
+
+	spin_lock_irq(&zone->lru_lock);
+	putback_inactive_pages(lruvec, &page_list);
+	spin_unlock_irq(&zone->lru_lock);
+done:
+	return nr_reclaimed;
+}
+
+
+
 /*
  * shrink_inactive_list() is a helper for shrink_zone().  It returns the number
  * of reclaimed pages
@@ -1776,6 +1982,13 @@ restart:
 	get_scan_count(lruvec, sc, nr);
 
 	blk_start_plug(&plug);
+
+
+	nr_to_scan = min_t(unsigned long, get_lru_size(lruvec, LRU_VOLATILE), SWAP_CLUSTER_MAX);
+	if (nr_to_scan)
+		shrink_volatile_list(nr_to_scan, lruvec, sc);
+
+
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
 					nr[LRU_INACTIVE_FILE]) {
 		for_each_evictable_lru(lru) {
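
For readers wanting to see how the new pieces fit together, here is a minimal, hypothetical sketch (not part of this patch) of an in-kernel consumer. Only mark_volatile_page(), mark_nonvolatile_page(), the LRU_VOLATILE list and the ->purgepage() address_space operation come from this series; the example_* names and the page-index walk are illustrative assumptions, not anyone's real driver.

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/swap.h>

/*
 * Called by reclaim once the page has been isolated, unmapped and removed
 * from the page cache; the backing store simply forgets the contents
 * instead of writing them back.
 */
static int example_purgepage(struct page *page, struct writeback_control *wbc)
{
	ClearPageDirty(page);		/* contents are disposable */
	ClearPageUptodate(page);
	return 0;
}

/* Would be installed as inode->i_mapping->a_ops by the example driver. */
static const struct address_space_operations example_aops = {
	.purgepage	= example_purgepage,
};

/*
 * When userspace marks a range volatile, move any resident pages onto the
 * LRU_VOLATILE list so reclaim purges them first; a matching "unmark" path
 * would call mark_nonvolatile_page() over the same range.
 */
static void example_mark_range_volatile(struct address_space *mapping,
					pgoff_t start, pgoff_t end)
{
	pgoff_t index;

	for (index = start; index < end; index++) {
		struct page *page = find_get_page(mapping, index);

		if (!page)
			continue;
		mark_volatile_page(page);
		page_cache_release(page);
	}
}

The key design point the sketch illustrates: the VM (via shrink_volatile_list()) decides *when* to purge, while the owner of the address_space only decides *which* pages are purgeable and how to forget them, so no shrinker callback or userspace round trip is needed at reclaim time.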