From patchwork Tue Dec 5 09:39:59 2017
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 120629
From: Michal Hocko
To: Greg KH
Cc: wangnan0@huawei.com, aarcange@redhat.com, akpm@linux-foundation.org,
    guro@fb.com, khlebnikov@yandex-team.ru, liubo95@huawei.com,
    minchan@kernel.org, mingo@kernel.org, rientjes@google.com,
    stable@vger.kernel.org, torvalds@linux-foundation.org,
    will.deacon@arm.com, Michal Hocko
Subject: [PATCH] mm, oom_reaper: gather each vma to prevent leaking TLB entry
Date: Tue, 5 Dec 2017 10:39:59 +0100
Message-Id: <20171205093959.9537-1-mhocko@kernel.org>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <151220782284155@kroah.com>
References: <151220782284155@kroah.com>
X-Mailing-List: stable@vger.kernel.org

From: Wang Nan

commit 687cb0884a714ff484d038e9190edc874edcf146 upstream.

tlb_gather_mmu(&tlb, mm, 0, -1) means gathering the whole virtual memory
space. In this case, tlb->fullmm is true. Some archs like arm64 don't
flush the TLB when tlb->fullmm is true; see commit 5a7862e83000 ("arm64:
tlbflush: avoid flushing when fullmm == 1"). This causes TLB entries to
be leaked.

Will clarifies his patch:

"Basically, we tag each address space with an ASID (PCID on x86) which
is resident in the TLB. This means we can elide TLB invalidation when
pulling down a full mm because we won't ever assign that ASID to
another mm without doing TLB invalidation elsewhere (which actually
just nukes the whole TLB).

I think that means that we could potentially not fault on a kernel
uaccess, because we could hit in the TLB"

There is a window between complete_signal() sending IPIs to other cores
and all threads sharing this mm actually being kicked off those cores.
In this window, the oom reaper may call tlb_flush_mmu_tlbonly() to
flush the TLB and then free pages.
However, due to the above problem, the TLB entries are not really
flushed on arm64, so other threads can still access these pages through
stale TLB entries. Moreover, a copy_to_user() can also write to these
pages without generating a page fault, causing use-after-free bugs.

This patch gathers each vma instead of the full vm space, so that
tlb->fullmm is not true. The behavior of the oom reaper becomes similar
to munmapping before do_exit, which should be safe for all archs.

Link: http://lkml.kernel.org/r/20171107095453.179940-1-wangnan0@huawei.com
Fixes: aac453635549 ("mm, oom: introduce oom reaper")
Signed-off-by: Wang Nan
Acked-by: Michal Hocko
Acked-by: David Rientjes
Cc: Minchan Kim
Cc: Will Deacon
Cc: Bob Liu
Cc: Ingo Molnar
Cc: Roman Gushchin
Cc: Konstantin Khlebnikov
Cc: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
[backported to 4.9 stable tree]
Signed-off-by: Michal Hocko
---
 mm/oom_kill.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

-- 
2.15.0

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index d631d251c150..4a184157cc3d 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -524,7 +524,6 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
 	 */
 	set_bit(MMF_UNSTABLE, &mm->flags);
 
-	tlb_gather_mmu(&tlb, mm, 0, -1);
 	for (vma = mm->mmap ; vma; vma = vma->vm_next) {
 		if (is_vm_hugetlb_page(vma))
 			continue;
@@ -546,11 +545,13 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
 		 * we do not want to block exit_mmap by keeping mm ref
 		 * count elevated without a good reason.
 		 */
-		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
+		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
+			tlb_gather_mmu(&tlb, mm, vma->vm_start, vma->vm_end);
 			unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
 					 &details);
+			tlb_finish_mmu(&tlb, vma->vm_start, vma->vm_end);
+		}
 	}
-	tlb_finish_mmu(&tlb, 0, -1);
 
 	pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
 			task_pid_nr(tsk), tsk->comm,
 			K(get_mm_counter(mm, MM_ANONPAGES)),