From patchwork Tue Oct 1 18:38:46 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 20729
From: John Stultz
To: Minchan Kim, Dhaval Giani
Subject: [PATCH 02/14] vrange: Add vrange support to mm_structs
Date: Tue, 1 Oct 2013 11:38:46 -0700
Message-Id: <1380652738-8000-3-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.8.1.2
In-Reply-To: <1380652738-8000-1-git-send-email-john.stultz@linaro.org>
References: <1380652738-8000-1-git-send-email-john.stultz@linaro.org>

From: Minchan Kim

This patch adds a vroot to the mm_struct so a process can set volatile
ranges on anonymous memory. This is somewhat wasteful, as it grows the
mm_struct even if the process never uses the vrange syscall, so a later
patch will provide dynamically allocated vroots.

One thing of note in this patch is vrange_fork. Since we do allocations
while holding a lock on the vrange, it's possible it could deadlock with
direct reclaim's purging logic. For this reason, vrange_fork uses
GFP_NOIO for its allocations.

If vrange_fork fails, it isn't a critical problem. The result is only
that the child process's pages won't be volatile/purgable, which could
cause additional memory pressure but won't cause problematic application
behavior (since volatile pages are only purged at the kernel's
discretion).
This is thought to be more desirable than having fork fail.

Cc: Andrew Morton
Cc: Android Kernel Team
Cc: Robert Love
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Dmitry Adamushko
Cc: Dave Chinner
Cc: Neil Brown
Cc: Andrea Righi
Cc: Andrea Arcangeli
Cc: Aneesh Kumar K.V
Cc: Mike Hommey
Cc: Taras Glek
Cc: Dhaval Giani
Cc: Jan Kara
Cc: KOSAKI Motohiro
Cc: Michel Lespinasse
Cc: Rob Clark
Cc: Minchan Kim
Cc: linux-mm@kvack.org
Signed-off-by: Minchan Kim
[jstultz: Bit of refactoring. Comment cleanups]
Signed-off-by: John Stultz
---
 include/linux/mm_types.h |  4 ++++
 include/linux/vrange.h   |  7 ++++++-
 kernel/fork.c            | 11 +++++++++++
 mm/vrange.c              | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index faf4b7c..5d8cdc3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -349,6 +350,9 @@ struct mm_struct {
 	 */
+#ifdef CONFIG_MMU
+	struct vrange_root vroot;
+#endif
 	unsigned long hiwater_rss;	/* High-watermark of RSS usage */
 	unsigned long hiwater_vm;	/* High-water virtual memory usage */
diff --git a/include/linux/vrange.h b/include/linux/vrange.h
index 0d378a5..2b96ee1 100644
--- a/include/linux/vrange.h
+++ b/include/linux/vrange.h
@@ -37,12 +37,17 @@ static inline int vrange_type(struct vrange *vrange)
 }
 
 extern void vrange_root_cleanup(struct vrange_root *vroot);
-
+extern int vrange_fork(struct mm_struct *new,
+		struct mm_struct *old);
 #else
 
 static inline void vrange_root_init(struct vrange_root *vroot, int type,
 				void *object) {};
 static inline void vrange_root_cleanup(struct vrange_root *vroot) {};
+static inline int vrange_fork(struct mm_struct *new, struct mm_struct *old)
+{
+	return 0;
+}
 
 #endif
 #endif /* _LINIUX_VRANGE_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index bf46287..ceb38bf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -71,6 +71,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -377,6 +378,14 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 	retval = khugepaged_fork(mm, oldmm);
 	if (retval)
 		goto out;
+	/*
+	 * Note: vrange_fork can fail in the case of ENOMEM, but
+	 * this only results in the child not having any active
+	 * volatile ranges. This is not harmful. Thus in this case
+	 * the child will not see any pages purged unless it remarks
+	 * them as volatile.
+	 */
+	vrange_fork(mm, oldmm);
 
 	prev = NULL;
 	for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
@@ -538,6 +547,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
 	mm->nr_ptes = 0;
 	memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
 	spin_lock_init(&mm->page_table_lock);
+	vrange_root_init(&mm->vroot, VRANGE_MM, mm);
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
@@ -609,6 +619,7 @@ void mmput(struct mm_struct *mm)
 	if (atomic_dec_and_test(&mm->mm_users)) {
 		uprobe_clear_state(mm);
+		vrange_root_cleanup(&mm->vroot);
 		exit_aio(mm);
 		ksm_exit(mm);
 		khugepaged_exit(mm); /* must run before exit_mmap */
diff --git a/mm/vrange.c b/mm/vrange.c
index 866566c..c590198 100644
--- a/mm/vrange.c
+++ b/mm/vrange.c
@@ -181,3 +181,43 @@ void vrange_root_cleanup(struct vrange_root *vroot)
 
 	vrange_unlock(vroot);
 }
+
+/*
+ * It's okay for vrange_fork to fail: the worst case is that the child
+ * process doesn't get a copy of the parent's vrange data structures,
+ * so pages in those ranges can't be purged. That is better than
+ * failing fork.
+ */
+int vrange_fork(struct mm_struct *new_mm, struct mm_struct *old_mm)
+{
+	struct vrange_root *new, *old;
+	struct vrange *range, *new_range;
+	struct rb_node *next;
+
+	new = &new_mm->vroot;
+	old = &old_mm->vroot;
+
+	vrange_lock(old);
+	next = rb_first(&old->v_rb);
+	while (next) {
+		range = vrange_entry(next);
+		next = rb_next(next);
+		/*
+		 * We can't use GFP_KERNEL here, because direct reclaim's
+		 * purging logic on vranges could deadlock on vrange_lock.
+		 */
+		new_range = __vrange_alloc(GFP_NOIO);
+		if (!new_range)
+			goto fail;
+		__vrange_set(new_range, range->node.start,
+				range->node.last, range->purged);
+		__vrange_add(new_range, new);
+	}
+	vrange_unlock(old);
+	return 0;
+fail:
+	vrange_unlock(old);
+	vrange_root_cleanup(new);
+	return -ENOMEM;
+}
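[Editor's note: for readers outside the kernel, the shape of vrange_fork's copy loop can be sketched in plain userspace C. The types below (struct range, struct range_list) are simplified hypothetical stand-ins for the kernel's vrange/rb-tree structures, and a singly linked list replaces the interval tree; only the copy-then-cleanup-on-failure pattern is the point.]

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct vrange: a [start, last] interval
 * that remembers whether its pages were purged. */
struct range {
	unsigned long start, last;
	int purged;
	struct range *next;
};

/* Hypothetical stand-in for struct vrange_root. */
struct range_list {
	struct range *head;
};

/* Analogous to vrange_root_cleanup: free every copied range. */
void range_list_cleanup(struct range_list *list)
{
	struct range *r = list->head;

	while (r) {
		struct range *next = r->next;
		free(r);
		r = next;
	}
	list->head = NULL;
}

/* Mirrors vrange_fork's loop: walk the parent's ranges, duplicate each
 * into the child, and on allocation failure tear down the partial copy
 * and report an error (which the caller may ignore, just as dup_mmap
 * ignores vrange_fork's return value). */
int range_list_fork(struct range_list *child, const struct range_list *parent)
{
	const struct range *r;

	child->head = NULL;
	for (r = parent->head; r; r = r->next) {
		struct range *copy = malloc(sizeof(*copy));
		if (!copy) {
			range_list_cleanup(child);
			return -1;
		}
		copy->start = r->start;
		copy->last = r->last;
		copy->purged = r->purged;
		copy->next = child->head;	/* prepend: reverses order */
		child->head = copy;
	}
	return 0;
}
```

Note the same design choice as the patch: a failed copy does not propagate an error to fork itself; the child simply ends up with no volatile ranges.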