From patchwork Fri Oct 30 15:57:15 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 317442
From: Zi Yan
To: Andrew Morton, linux-mm@kvack.org
Cc: Yang Shi, Michal Hocko, Vlastimil Babka, Rik van Riel,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org, Zi Yan
Subject:
[PATCH v2 1/2] mm/compaction: count pages and stop correctly during page isolation.
Date: Fri, 30 Oct 2020 11:57:15 -0400
Message-Id: <20201030155716.3614401-1-zi.yan@sent.com>
X-Mailer: git-send-email 2.28.0
Reply-To: Zi Yan
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org

From: Zi Yan

In isolate_migratepages_block(), when cc->alloc_contig is true, compound
pages can be isolated. However, nr_migratepages and nr_isolated counted
each compound page as a single page, so more pages were isolated than
the counters reported. Use compound_nr() to count pages. Otherwise, we
might be trapped in the too_many_isolated() while loop: isolation stops
only when cc->nr_migratepages reaches COMPACT_CLUSTER_MAX, which is 32,
so the number of pages actually isolated can go up to
COMPACT_CLUSTER_MAX*512=16384.

In addition, once the counting is fixed, cc->nr_migratepages can jump
past COMPACT_CLUSTER_MAX without ever being equal to it when compound
pages are isolated, so page isolation would not stop as intended.
Change the isolation stop condition to >=.

The issue can be triggered as follows. On a system with 16GB of memory
and an 8GB CMA region reserved by hugetlb_cma, first allocate 10GB of
THPs and mlock them (so some THPs are allocated in the CMA region and
mlocked). Reserving six 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages then gets
stuck (looping in the too_many_isolated() function) until either task is
killed. With the patch applied, the OOM killer kills the application
holding the 10GB of THPs and lets the hugetlb page reservation finish.
Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan
Reviewed-by: Yang Shi
Cc:
---
 mm/compaction.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..3e834ac402f1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
-		cc->nr_migratepages++;
-		nr_isolated++;
+		cc->nr_migratepages += compound_nr(page);
+		nr_isolated += compound_nr(page);
 
 		/*
 		 * Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * or a lock is contended. For contention, isolate quickly to
 		 * potentially remove one source of contention.
 		 */
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
 		    !cc->rescan && !cc->contended) {
 			++low_pfn;
 			break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 		if (!pfn)
 			break;
 
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
 			break;
 	}
 