From patchwork Tue Jan 19 04:39:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366495
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
 mhocko@suse.com, david@redhat.com, osalvador@suse.de,
 dan.j.williams@intel.com, sashal@kernel.org, tyhicks@linux.microsoft.com,
 iamjoonsoo.kim@lge.com, mike.kravetz@oracle.com, rostedt@goodmis.org,
 mingo@redhat.com, jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
 willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
 linux-doc@vger.kernel.org, ira.weiny@intel.com,
 linux-kselftest@vger.kernel.org
Subject: [PATCH v5 02/14] mm/gup: check every subpage of a compound page during isolation
Date: Mon, 18 Jan 2021 23:39:08 -0500
Message-Id: <20210119043920.155044-3-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

When pages are isolated in check_and_migrate_movable_pages() we skip a
compound page's worth of pages at a time. However, as Jason noted, it is
not necessarily correct that pages[i] corresponds to the pages that we
skipped. This is because it is possible that the addresses in this range
had split_huge_pmd()/split_huge_pud() called on them, and these functions
do not update the compound page metadata.

The problem can be reproduced if something like this occurs:

1. User faulted huge pages.
2. split_huge_pmd() was called for some reason.
3. User has unmapped some sub-pages in the range.
4. User tries to longterm pin the addresses.

The resulting pages[i] might end up containing pages which are not
aligned to the compound page size.
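
For illustration only, and not part of this patch: a minimal user-space
sketch of the scenario above. It assumes a hypothetical toy_page type in
place of struct page, and the two helper names are made up for this
example. The first helper steps over a whole compound page at a time, the
second visits every entry and only skips repeats of the previous head.

```c
#include <stddef.h>

/*
 * Toy stand-in for struct page (hypothetical, user-space only): it records
 * just the compound head and, on the head, the number of sub-pages.
 */
struct toy_page {
	const struct toy_page *head;	/* itself for a head or order-0 page */
	unsigned long nr;		/* sub-pages in the compound page */
};

/* Step scheme: after looking at pages[i], skip the rest of its compound page. */
size_t heads_checked_by_step(const struct toy_page *const *pages,
			     size_t nr_pages)
{
	size_t checked = 0;
	size_t i = 0;

	while (i < nr_pages) {
		const struct toy_page *head = pages[i]->head;

		checked++;
		/* Assumes the following entries are the remaining sub-pages. */
		i += head->nr - (size_t)(pages[i] - head);
	}
	return checked;
}

/* Dedup scheme: look at every entry, skip only repeats of the previous head. */
size_t heads_checked_by_dedup(const struct toy_page *const *pages,
			      size_t nr_pages)
{
	const struct toy_page *prev_head = NULL;
	size_t checked = 0;
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		const struct toy_page *head = pages[i]->head;

		if (head == prev_head)
			continue;
		prev_head = head;
		checked++;
	}
	return checked;
}
```

If pages[] holds two sub-pages of a four-page compound page followed by
two unrelated order-0 pages, the step scheme jumps past the end of the
array after the first entry and never looks at the order-0 pages, while
the dedup scheme examines all three distinct heads.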

Fixes: aa712399c1e8 ("mm/gup: speed up check_and_migrate_cma_pages() on huge page")
Reported-by: Jason Gunthorpe
Signed-off-by: Pavel Tatashin
---
 mm/gup.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 24f25b1e9103..16f10d5a9eb6 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1556,26 +1556,23 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
 					unsigned int gup_flags)
 {
 	unsigned long i;
-	unsigned long step;
 	bool drain_allow = true;
 	bool migrate_allow = true;
 	LIST_HEAD(cma_page_list);
 	long ret = nr_pages;
+	struct page *prev_head, *head;
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_NOWARN,
 	};
 
 check_again:
-	for (i = 0; i < nr_pages;) {
-
-		struct page *head = compound_head(pages[i]);
-
-		/*
-		 * gup may start from a tail page. Advance step by the left
-		 * part.
-		 */
-		step = compound_nr(head) - (pages[i] - head);
+	prev_head = NULL;
+	for (i = 0; i < nr_pages; i++) {
+		head = compound_head(pages[i]);
+		if (head == prev_head)
+			continue;
+		prev_head = head;
 		/*
 		 * If we get a page from the CMA zone, since we are going to
 		 * be pinning these entries, we might as well move them out
@@ -1599,8 +1596,6 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
 				}
 			}
 		}
-
-		i += step;
 	}
 
 	if (!list_empty(&cma_page_list)) {

From patchwork Tue Jan 19 04:39:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366496
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
 mhocko@suse.com, david@redhat.com, osalvador@suse.de,
 dan.j.williams@intel.com, sashal@kernel.org, tyhicks@linux.microsoft.com,
 iamjoonsoo.kim@lge.com, mike.kravetz@oracle.com, rostedt@goodmis.org,
 mingo@redhat.com, jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
 willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
 linux-doc@vger.kernel.org, ira.weiny@intel.com,
 linux-kselftest@vger.kernel.org
Subject: [PATCH v5 03/14] mm/gup: return an error on migration failure
Date: Mon, 18 Jan 2021 23:39:09 -0500
Message-Id: <20210119043920.155044-4-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

When migration fails, we still pin the pages, which means that we may pin
CMA movable pages; that should never be the case. Instead, return an error
without pinning pages when migration fails.

There is no need to retry migrating, because migrate_pages() already
retries 10 times.

Signed-off-by: Pavel Tatashin
---
 mm/gup.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 16f10d5a9eb6..88ce41f41543 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1557,7 +1557,6 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
 {
 	unsigned long i;
 	bool drain_allow = true;
-	bool migrate_allow = true;
 	LIST_HEAD(cma_page_list);
 	long ret = nr_pages;
 	struct page *prev_head, *head;
@@ -1608,17 +1607,15 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
 			for (i = 0; i < nr_pages; i++)
 				put_page(pages[i]);
 
-		if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
-			(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
-			/*
-			 * some of the pages failed migration. Do get_user_pages
-			 * without migration.
-			 */
-			migrate_allow = false;
-
+		ret = migrate_pages(&cma_page_list, alloc_migration_target,
+				    NULL, (unsigned long)&mtc, MIGRATE_SYNC,
+				    MR_CONTIG_RANGE);
+		if (ret) {
 			if (!list_empty(&cma_page_list))
 				putback_movable_pages(&cma_page_list);
+			return ret > 0 ? -ENOMEM : ret;
 		}
+
 		/*
 		 * We did migrate all the pages, Try to get the page references
 		 * again migrating any new CMA pages which we failed to isolate
@@ -1628,7 +1625,7 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
 						   pages, vmas, NULL,
 						   gup_flags);
 
-		if ((ret > 0) && migrate_allow) {
+		if (ret > 0) {
 			nr_pages = ret;
 			drain_allow = true;
 			goto check_again;

From patchwork Tue Jan 19 04:39:14 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366490
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
 mhocko@suse.com, david@redhat.com, osalvador@suse.de,
 dan.j.williams@intel.com, sashal@kernel.org, tyhicks@linux.microsoft.com,
 iamjoonsoo.kim@lge.com, mike.kravetz@oracle.com, rostedt@goodmis.org,
 mingo@redhat.com, jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
 willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
 linux-doc@vger.kernel.org, ira.weiny@intel.com,
 linux-kselftest@vger.kernel.org
Subject: [PATCH v5 08/14] mm/gup: do not allow zero page for pinned pages
Date: Mon, 18 Jan 2021 23:39:14 -0500
Message-Id: <20210119043920.155044-9-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

The zero page should not be used for long-term pinned pages. Once pages
are pinned, their physical addresses cannot change until they are
unpinned.

Guarantee that real pages are always returned when they are pinned by
adding FOLL_WRITE.

Signed-off-by: Pavel Tatashin
---
 mm/gup.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index 857b273e32ac..9a817652f501 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1668,8 +1668,16 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 	unsigned long flags = 0;
 	long rc;
 
-	if (gup_flags & FOLL_LONGTERM)
+	if (gup_flags & FOLL_LONGTERM) {
+		/*
+		 * We are long term pinning pages and their PA's should not
+		 * change until unpinned. Without FOLL_WRITE we might get zero
+		 * page which we do not want. Force creating normal
+		 * pages by adding FOLL_WRITE.
+		 */
+		gup_flags |= FOLL_WRITE;
 		flags = memalloc_pin_save();
+	}
 
 	rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas, NULL,
 				     gup_flags);

From patchwork Tue Jan 19 04:39:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366491
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
 mhocko@suse.com, david@redhat.com, osalvador@suse.de,
 dan.j.williams@intel.com, sashal@kernel.org, tyhicks@linux.microsoft.com,
 iamjoonsoo.kim@lge.com, mike.kravetz@oracle.com, rostedt@goodmis.org,
 mingo@redhat.com, jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
 willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
 linux-doc@vger.kernel.org, ira.weiny@intel.com,
 linux-kselftest@vger.kernel.org
Subject: [PATCH v5 09/14] mm/gup: migrate pinned pages out of movable zone
Date: Mon, 18 Jan 2021 23:39:15 -0500
Message-Id: <20210119043920.155044-10-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

We should not pin pages in ZONE_MOVABLE. Currently, only movable CMA
pages are excluded from pinning. Generalize the function that migrates
CMA pages to migrate all movable pages.
Use is_pinnable_page() to check which pages need to be migrated.

Signed-off-by: Pavel Tatashin
Reviewed-by: John Hubbard
---
 include/linux/migrate.h        |  1 +
 include/linux/mmzone.h         |  9 +++--
 include/trace/events/migrate.h |  3 +-
 mm/gup.c                       | 63 ++++++++++++++--------------------
 4 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 4594838a0f7c..aae5ef0b3ba1 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -27,6 +27,7 @@ enum migrate_reason {
 	MR_MEMPOLICY_MBIND,
 	MR_NUMA_MISPLACED,
 	MR_CONTIG_RANGE,
+	MR_LONGTERM_PIN,
 	MR_TYPES
 };
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fc99e9241846..18cf6729b5f9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -407,8 +407,13 @@ enum zone_type {
 	 * to increase the number of THP/huge pages. Notable special cases are:
 	 *
 	 * 1. Pinned pages: (long-term) pinning of movable pages might
-	 *    essentially turn such pages unmovable. Memory offlining might
-	 *    retry a long time.
+	 *    essentially turn such pages unmovable. Therefore, we do not allow
+	 *    pinning long-term pages in ZONE_MOVABLE. When pages are pinned and
+	 *    faulted, they come from the right zone right away. However, it is
+	 *    still possible that address space already has pages in
+	 *    ZONE_MOVABLE at the time when pages are pinned (i.e. user has
+	 *    touched that memory before pinning). In such case we migrate them
+	 *    to a different zone. When migration fails - pinning fails.
 	 * 2. memblock allocations: kernelcore/movablecore setups might create
 	 *    situations where ZONE_MOVABLE contains unmovable allocations
 	 *    after boot. Memory offlining and allocations fail early.
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h index 4d434398d64d..363b54ce104c 100644 --- a/include/trace/events/migrate.h +++ b/include/trace/events/migrate.h @@ -20,7 +20,8 @@ EM( MR_SYSCALL, "syscall_or_cpuset") \ EM( MR_MEMPOLICY_MBIND, "mempolicy_mbind") \ EM( MR_NUMA_MISPLACED, "numa_misplaced") \ - EMe(MR_CONTIG_RANGE, "contig_range") + EM( MR_CONTIG_RANGE, "contig_range") \ + EMe(MR_LONGTERM_PIN, "longterm_pin") /* * First define the enums in the above macros to be exported to userspace diff --git a/mm/gup.c b/mm/gup.c index 9a817652f501..c301ab060de6 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -89,11 +89,12 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page, int orig_refs = refs; /* - * Can't do FOLL_LONGTERM + FOLL_PIN with CMA in the gup fast - * path, so fail and let the caller fall back to the slow path. + * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a + * right zone, so fail and let the caller fall back to the slow + * path. 
*/ - if (unlikely(flags & FOLL_LONGTERM) && - is_migrate_cma_page(page)) + if (unlikely((flags & FOLL_LONGTERM) && + !is_pinnable_page(page))) return NULL; /* @@ -1547,17 +1548,16 @@ struct page *get_dump_page(unsigned long addr) } #endif /* CONFIG_ELF_CORE */ -#ifdef CONFIG_CMA -static long check_and_migrate_cma_pages(struct mm_struct *mm, - unsigned long start, - unsigned long nr_pages, - struct page **pages, - struct vm_area_struct **vmas, - unsigned int gup_flags) +static long check_and_migrate_movable_pages(struct mm_struct *mm, + unsigned long start, + unsigned long nr_pages, + struct page **pages, + struct vm_area_struct **vmas, + unsigned int gup_flags) { unsigned long i, isolation_error_count; bool drain_allow; - LIST_HEAD(cma_page_list); + LIST_HEAD(movable_page_list); long ret = nr_pages; struct page *prev_head, *head; struct migration_target_control mtc = { @@ -1575,13 +1575,12 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm, continue; prev_head = head; /* - * If we get a page from the CMA zone, since we are going to - * be pinning these entries, we might as well move them out - * of the CMA zone if possible. + * If we get a movable page, since we are going to be pinning + * these entries, try to move them out if possible. 
*/ - if (is_migrate_cma_page(head)) { + if (!is_pinnable_page(head)) { if (PageHuge(head)) { - if (!isolate_huge_page(head, &cma_page_list)) + if (!isolate_huge_page(head, &movable_page_list)) isolation_error_count++; } else { if (!PageLRU(head) && drain_allow) { @@ -1593,7 +1592,7 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm, isolation_error_count++; continue; } - list_add_tail(&head->lru, &cma_page_list); + list_add_tail(&head->lru, &movable_page_list); mod_node_page_state(page_pgdat(head), NR_ISOLATED_ANON + page_is_file_lru(head), @@ -1606,10 +1605,10 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm, * If list is empty, and no isolation errors, means that all pages are * in the correct zone. */ - if (list_empty(&cma_page_list) && !isolation_error_count) + if (list_empty(&movable_page_list) && !isolation_error_count) return ret; - if (!list_empty(&cma_page_list)) { + if (!list_empty(&movable_page_list)) { /* * drop the above get_user_pages reference. */ @@ -1619,12 +1618,12 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm, for (i = 0; i < nr_pages; i++) put_page(pages[i]); - ret = migrate_pages(&cma_page_list, alloc_migration_target, + ret = migrate_pages(&movable_page_list, alloc_migration_target, NULL, (unsigned long)&mtc, MIGRATE_SYNC, - MR_CONTIG_RANGE); + MR_LONGTERM_PIN); if (ret) { - if (!list_empty(&cma_page_list)) - putback_movable_pages(&cma_page_list); + if (!list_empty(&movable_page_list)) + putback_movable_pages(&movable_page_list); return ret > 0 ? 
-ENOMEM : ret; } @@ -1642,17 +1641,6 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm, */ goto check_again; } -#else -static long check_and_migrate_cma_pages(struct mm_struct *mm, - unsigned long start, - unsigned long nr_pages, - struct page **pages, - struct vm_area_struct **vmas, - unsigned int gup_flags) -{ - return nr_pages; -} -#endif /* CONFIG_CMA */ /* * __gup_longterm_locked() is a wrapper for __get_user_pages_locked which @@ -1684,8 +1672,9 @@ static long __gup_longterm_locked(struct mm_struct *mm, if (gup_flags & FOLL_LONGTERM) { if (rc > 0) - rc = check_and_migrate_cma_pages(mm, start, rc, pages, - vmas, gup_flags); + rc = check_and_migrate_movable_pages(mm, start, rc, + pages, vmas, + gup_flags); memalloc_pin_restore(flags); } return rc; From patchwork Tue Jan 19 04:39:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pasha Tatashin X-Patchwork-Id: 366494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31D9FC43331 for ; Tue, 19 Jan 2021 04:45:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 10F6422241 for ; Tue, 19 Jan 2021 04:45:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732249AbhASEpO (ORCPT ); Mon, 18 Jan 2021 23:45:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1732796AbhASElw (ORCPT ); Mon, 18 Jan 2021 23:41:52 -0500 Received: from mail-qv1-xf2d.google.com (mail-qv1-xf2d.google.com [IPv6:2607:f8b0:4864:20::f2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D934C0617B1 for ; Mon, 18 Jan 2021 20:39:39 -0800 (PST) Received: by mail-qv1-xf2d.google.com with SMTP id s6so8586959qvn.6 for ; Mon, 18 Jan 2021 20:39:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=z/iYHqVeCyWybAIwBASO/JRh5H6y4Wucaqn9XTIErdQ=; b=ZzAEvUNsgm1gA/BtAGDww/ogGjjSgj1kDd613YzbM+YqhjB9Pm0Bnu+sVa84HgcfH9 d9dFvXQUB9iDOJ/KZ399mPlbIcgiuc7vWLsQkp6289MXPOfYuuIxKVB7pC9ksqAYZEZe VHJOGBBUYn8hQ6PXGRXG73CrtaC66uYplcvxFE/pdoCVeg9FjOlJmv+ly8nTlfX/BqUE WeDBMlAG2f4CKhxvhm8/maPGZ6Wgmtd20Q6uoAv/arIKIriZGzM49D/YJ/bK79FPMAps TCtqulcq7dAP54Te25gabmp7YQpFkP8ipaMEsDALpfV2a99C+f1WA8wzTSVL3L/YzM23 61hA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=z/iYHqVeCyWybAIwBASO/JRh5H6y4Wucaqn9XTIErdQ=; b=p/ynEKE3ANxBGaloSy48jJtNBy5NpHVwLY2pIPhqeXq5YTj+4mgEjs0GvkPhVr1prw 0PP5oAUCsSXmXRgdjIjB3A67An1n0tMFj8554/PDWgrQhfsHNE8B7mDmpMFmJUu+WfrZ Q31zaeagNp5+gBbAjBjJtOx6/VNei/Lt4CbTj5qSXmm0aBQzFN/cwGAon0F+RCjjkKQc 9JOAathuwyOajCnxV1/l6qjr8VssGNfqJzP2BbgRiu/5Sz1LgEMrrQmbdjnDtVANSx3e RFlqeO9JrtX2Ny1xyuqoxazEho7uoka1ktMn3c6lliGVz42iarC85+8d8CnmrCPyIDCg km6A== X-Gm-Message-State: AOAM5331QLGlI+FI0PNw+uo6SgNS4u3McMgDNZyao3xy4PXf27B0+v2H 2T2xRVkXpuscGq9U1fBZEERj+w== X-Google-Smtp-Source: ABdhPJx3a4TDVVpQ3OtzgpKeD4fURR/7LsDvn4tqhZ4fXd18CTYySHhA2WdtdMDk2WAbkN8BxjxaMQ== X-Received: by 2002:a0c:8027:: with SMTP id 36mr2871477qva.57.1611031178459; Mon, 18 Jan 2021 20:39:38 -0800 (PST) Received: from localhost.localdomain (c-73-69-118-222.hsd1.nh.comcast.net. 
[73.69.118.222]) by smtp.gmail.com with ESMTPSA id z20sm11934536qkz.37.2021.01.18.20.39.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Jan 2021 20:39:37 -0800 (PST)
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
    mhocko@suse.com, david@redhat.com, osalvador@suse.de,
    dan.j.williams@intel.com, sashal@kernel.org,
    tyhicks@linux.microsoft.com, iamjoonsoo.kim@lge.com,
    mike.kravetz@oracle.com, rostedt@goodmis.org, mingo@redhat.com,
    jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
    willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
    linux-doc@vger.kernel.org, ira.weiny@intel.com,
    linux-kselftest@vger.kernel.org
Subject: [PATCH v5 10/14] memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
Date: Mon, 18 Jan 2021 23:39:16 -0500
Message-Id: <20210119043920.155044-11-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

Document the special handling of page pinning when ZONE_MOVABLE is present.

Signed-off-by: Pavel Tatashin
Suggested-by: David Hildenbrand
Acked-by: Michal Hocko
---
 Documentation/admin-guide/mm/memory-hotplug.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 5c4432c96c4b..c6618f99f765 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -357,6 +357,15 @@ creates ZONE_MOVABLE as following.
    Unfortunately, there is no information to show which memory block belongs
    to ZONE_MOVABLE. This is TBD.
 
+.. note::
+   Techniques that rely on long-term pinnings of memory (especially, RDMA and
+   vfio) are fundamentally problematic with ZONE_MOVABLE and, therefore, memory
+   hot remove. Pinned pages cannot reside on ZONE_MOVABLE, to guarantee that
+   memory can still get hot removed - be aware that pinning can fail even if
+   there is plenty of free memory in ZONE_MOVABLE. In addition, using
+   ZONE_MOVABLE might make page pinning more expensive, because pages have to be
+   migrated off that zone first.
+
 .. _memory_hotplug_how_to_offline_memory:
 
 How to offline memory

From patchwork Tue Jan 19 04:39:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366493
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0F79C433DB for ; Tue, 19 Jan 2021 04:46:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CEADE2222F for ; Tue, 19 Jan 2021 04:46:02 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733268AbhASEpW (ORCPT ); Mon, 18 Jan 2021 23:45:22 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387657AbhASEmU (ORCPT ); Mon, 18 Jan 2021 23:42:20 -0500
Received: from mail-qk1-x730.google.com (mail-qk1-x730.google.com [IPv6:2607:f8b0:4864:20::730]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id
4BE83C0617BD for ; Mon, 18 Jan 2021 20:39:42 -0800 (PST)
Received: by mail-qk1-x730.google.com with SMTP id b64so20773531qkc.12 for ; Mon, 18 Jan 2021 20:39:42 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Mozcc4kuYbrNkd78OelIwDi34wEitfuXZj+0y++fO7I=; b=nes348fen+42chjh/uf/Hck11INwC4iFF+XsgAEA6JHbTToYZq+aKmO9bDVdE+Lrmb wcdXVwEqY0eTMoP4wHR43h/1DOgUOy/LtJm5xPD3IlHDtJprVyXbWMz6SOw9FuwIdU73 Ed+0/ijhnsSsDFNP5U1BUpB+suSRTWJyi5I6qI0duiYrXzY7K1aw59rgkLXhyAIGA37k aIpWbLk+2Jxz51n6c4+fmw0vTapPA/zdo46f/yruQukd826LxWgTaw8LL1ONuOvTtN3I 6NF9IjGvqI0+5CTIrxPtxwbGMsFpLrduIO8IYFer47jbxzbGc48HQgDgBNyDhc5TZliz 3vEQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Mozcc4kuYbrNkd78OelIwDi34wEitfuXZj+0y++fO7I=; b=H7ZXtY3lmC7jfAqt+0Nqr1G9PFjkdbb04C/+t6IvNBPJ4nggYn0JnGdSzuV4LJ7zQ/ aax310fPn+IabO9L6T0cPzdm3TYVu/izUD9G5c4AyToBHmP/4S8umuBmnDtj67QQXahd scIBF7zFC4T6jB3NMI6ZieFz/D8ZSQv7pAHD5tHC/X294VMZISYyMUv4H3MyUWb1UgLl Pe4TPb2QViuT+9tEQePSk6cY3e8rEzBJvY2vpbbx55nTeCcsqngw0KVZHFVz9GuMFhMv xPNun2MtDD4V9ybNuo76KSwO8GKUvUT9DNoO/H8vN+EH9DYX8VtcoPErkLpoTUdXzGg6 5RPw==
X-Gm-Message-State: AOAM532tZu2O8NmeoWoXDC9RAU4fo+eg0RPyi27JiwZgnRj/NymbssIE 8F29f3c0tFcD1xUJahitRTdHiZsWD4Aw2Q==
X-Google-Smtp-Source: ABdhPJz+FNci4xuBDvj9gcDJx+dJbGOnd11aZ4ctDWNnJNtrR0jWNj8Mqh5wiA0py6oc3gJCr/ZP1Q==
X-Received: by 2002:a37:e211:: with SMTP id g17mr2709283qki.298.1611031181530; Mon, 18 Jan 2021 20:39:41 -0800 (PST)
Received: from localhost.localdomain (c-73-69-118-222.hsd1.nh.comcast.net.
[73.69.118.222]) by smtp.gmail.com with ESMTPSA id z20sm11934536qkz.37.2021.01.18.20.39.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Jan 2021 20:39:41 -0800 (PST)
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
    mhocko@suse.com, david@redhat.com, osalvador@suse.de,
    dan.j.williams@intel.com, sashal@kernel.org,
    tyhicks@linux.microsoft.com, iamjoonsoo.kim@lge.com,
    mike.kravetz@oracle.com, rostedt@goodmis.org, mingo@redhat.com,
    jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
    willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
    linux-doc@vger.kernel.org, ira.weiny@intel.com,
    linux-kselftest@vger.kernel.org
Subject: [PATCH v5 12/14] mm/gup: longterm pin migration cleanup
Date: Mon, 18 Jan 2021 23:39:18 -0500
Message-Id: <20210119043920.155044-13-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

When pages are longterm pinned, we must migrate them out of the movable
zone. The function that migrates them has a hidden loop with goto. The
loop retries on isolation failures and after a successful migration.

Make this code easier to follow by moving the loop to the caller.

Signed-off-by: Pavel Tatashin
---
 mm/gup.c | 101 +++++++++++++++++++++++--------------------------------
 1 file changed, 42 insertions(+), 59 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index dfe90b254bc6..3b46eb5fe3ba 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1548,27 +1548,28 @@ struct page *get_dump_page(unsigned long addr)
 }
 #endif /* CONFIG_ELF_CORE */
 
-static long check_and_migrate_movable_pages(struct mm_struct *mm,
-					    unsigned long start,
-					    unsigned long nr_pages,
+/*
+ * Check whether all pages are pinnable, if so return number of pages. If some
+ * pages are not pinnable, migrate them, and unpin all pages. Return zero if
+ * pages were migrated, or if some pages were not successfully isolated.
+ * Return negative error if migration fails.
+ */
+static long check_and_migrate_movable_pages(unsigned long nr_pages,
 					    struct page **pages,
-					    struct vm_area_struct **vmas,
 					    unsigned int gup_flags)
 {
-	unsigned long i, isolation_error_count;
-	bool drain_allow;
+	unsigned long i;
+	unsigned long isolation_error_count = 0;
+	bool drain_allow = true;
 	LIST_HEAD(movable_page_list);
-	long ret = nr_pages;
-	struct page *prev_head, *head;
+	long ret = 0;
+	struct page *prev_head = NULL;
+	struct page *head;
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_NOWARN,
 	};
 
-check_again:
-	prev_head = NULL;
-	isolation_error_count = 0;
-	drain_allow = true;
 	for (i = 0; i < nr_pages; i++) {
 		head = compound_head(pages[i]);
 		if (head == prev_head)
@@ -1606,40 +1607,23 @@ static long check_and_migrate_movable_pages(struct mm_struct *mm,
 	 * in the correct zone.
 	 */
 	if (list_empty(&movable_page_list) && !isolation_error_count)
-		return ret;
+		return nr_pages;
 
+	if (gup_flags & FOLL_PIN) {
+		unpin_user_pages(pages, nr_pages);
+	} else {
+		for (i = 0; i < nr_pages; i++)
+			put_page(pages[i]);
+	}
 	if (!list_empty(&movable_page_list)) {
-		/*
-		 * drop the above get_user_pages reference.
-		 */
-		if (gup_flags & FOLL_PIN)
-			unpin_user_pages(pages, nr_pages);
-		else
-			for (i = 0; i < nr_pages; i++)
-				put_page(pages[i]);
-
 		ret = migrate_pages(&movable_page_list, alloc_migration_target,
 				    NULL, (unsigned long)&mtc, MIGRATE_SYNC,
 				    MR_LONGTERM_PIN);
-		if (ret) {
-			if (!list_empty(&movable_page_list))
-				putback_movable_pages(&movable_page_list);
-			return ret > 0 ? -ENOMEM : ret;
-		}
-
-		/* We unpinned pages before migration, pin them again */
-		ret = __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
-					      NULL, gup_flags);
-		if (ret <= 0)
-			return ret;
-		nr_pages = ret;
+		if (ret && !list_empty(&movable_page_list))
+			putback_movable_pages(&movable_page_list);
 	}
 
-	/*
-	 * check again because pages were unpinned, and we also might have
-	 * had isolation errors and need more pages to migrate.
-	 */
-	goto check_again;
+	return ret > 0 ? -ENOMEM : ret;
 }
 
 /*
@@ -1653,30 +1637,29 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 				  struct vm_area_struct **vmas,
 				  unsigned int gup_flags)
 {
-	unsigned long flags = 0;
+	unsigned int flags;
 	long rc;
 
-	if (gup_flags & FOLL_LONGTERM) {
-		/*
-		 * We are long term pinning pages and their PA's should not
-		 * change until unpinned. Without FOLL_WRITE we might get zero
-		 * page which we do not want. Force creating normal
-		 * pages by adding FOLL_WRITE.
-		 */
-		gup_flags |= FOLL_WRITE;
-		flags = memalloc_pin_save();
-	}
+	if (!(gup_flags & FOLL_LONGTERM))
+		return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
+					       NULL, gup_flags);
+	/*
+	 * We are long term pinning pages and their PA's should not change until
+	 * unpinned. Without FOLL_WRITE we might get zero page which we do not
+	 * want. Force creating normal pages by adding FOLL_WRITE.
+	 */
+	gup_flags |= FOLL_WRITE;
+	flags = memalloc_pin_save();
 
-	rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas, NULL,
-				     gup_flags);
+	do {
+		rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
+					     NULL, gup_flags);
+		if (rc <= 0)
+			break;
+		rc = check_and_migrate_movable_pages(rc, pages, gup_flags);
+	} while (!rc);
+	memalloc_pin_restore(flags);
 
-	if (gup_flags & FOLL_LONGTERM) {
-		if (rc > 0)
-			rc = check_and_migrate_movable_pages(mm, start, rc,
-							     pages, vmas,
-							     gup_flags);
-		memalloc_pin_restore(flags);
-	}
 	return rc;
 }
 

From patchwork Tue Jan 19 04:39:20 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 366492
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71EA5C433E0 for ; Tue, 19 Jan 2021 05:18:00 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3AE81221EA for ; Tue, 19 Jan 2021 05:18:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387671AbhASEp1 (ORCPT ); Mon, 18 Jan 2021 23:45:27 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387792AbhASEmc (ORCPT ); Mon, 18 Jan 2021 23:42:32 -0500
Received: from mail-qk1-x72e.google.com (mail-qk1-x72e.google.com [IPv6:2607:f8b0:4864:20::72e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id
6DF62C061383 for ; Mon, 18 Jan 2021 20:39:45 -0800 (PST)
Received: by mail-qk1-x72e.google.com with SMTP id 186so20788416qkj.3 for ; Mon, 18 Jan 2021 20:39:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=GrD5c9rl+nII4dP015r3zSdw7qVCON9D7TFIA6KRom8=; b=bFnhOEN6Qv2S/HqctXkiMhHJAVkoPKQS8sMyCEJOO0o3KfR8IjIjYLkN83MQ/Nvvt9 OpB6h9A6EB6v9xtqZXbEachoM82WqY3pZcrSAWEPAUzFJBaV51jCmA0+kM1AkfI63QEp /VC9wUe6Epd+99hD3btoHOUJ+M5r/Jyqn7dKw9dbu8J026YxMIn3TMIjH7A/dtbQFAKa ZFWMODdfzRuMjZMRaWIT/TLXUGpJvv6Ud3SPVRCuj5ShWVfpnbKqugHOvKsGZ3fVkTDJ mh3FGaj4jqVjV0L95goZ8j7ea/wRSJ2mhgh0rCagBleHX4d1fT/8fL26PSvCYO9uAkBX YQRQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GrD5c9rl+nII4dP015r3zSdw7qVCON9D7TFIA6KRom8=; b=j7+Gb0xEumW5PPsAJsFDNgFTTatXc3T5XlM0SMljBaHz55A4Bfyjm+YXGg5m4Nqq4e pY4/3Iy7A2qRp/HySYRu+6a5eOan9sjWUQ+feEPyoSkN8vefX4dp6sYZxYSCMO9kQom+ BNTaIFnQSMDikxO5Es0/jqKQvl5GDL7ikbrJybJljQJ2QtxENowYkOY29RlEN9rXxt5R zi9uGqDJvtIqFlVvbkLQ+qyqD+eigJToJZd80cP0L/RL8DYOtFXNlludE7TjUThygtcb NRrXxmqnsb1umByjvzKKpShc15M8t88oMB/mMwnXeyd4ISnYqhHZH9WF/En9IDkdufj6 wYng==
X-Gm-Message-State: AOAM530B2QLtas+Tj0RiZI87Tk0jDIRgbGggeSlofd+XdlF0IeYjcvTp 76A65+3mNbL0ZWd5WH44UyNW5w==
X-Google-Smtp-Source: ABdhPJzIJnuNSk1VeOUOLD2b3+EgzyuqPhHZjyj95Mv5V3QwSOIYNgtx5B7jN8TQadRhwc3DX6/ONw==
X-Received: by 2002:a05:620a:46:: with SMTP id t6mr2751756qkt.108.1611031184638; Mon, 18 Jan 2021 20:39:44 -0800 (PST)
Received: from localhost.localdomain (c-73-69-118-222.hsd1.nh.comcast.net.
[73.69.118.222]) by smtp.gmail.com with ESMTPSA id z20sm11934536qkz.37.2021.01.18.20.39.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Jan 2021 20:39:44 -0800 (PST)
From: Pavel Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@suse.cz,
    mhocko@suse.com, david@redhat.com, osalvador@suse.de,
    dan.j.williams@intel.com, sashal@kernel.org,
    tyhicks@linux.microsoft.com, iamjoonsoo.kim@lge.com,
    mike.kravetz@oracle.com, rostedt@goodmis.org, mingo@redhat.com,
    jgg@ziepe.ca, peterz@infradead.org, mgorman@suse.de,
    willy@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
    linux-doc@vger.kernel.org, ira.weiny@intel.com,
    linux-kselftest@vger.kernel.org
Subject: [PATCH v5 14/14] selftests/vm: test faulting in kernel, and verify pinnable pages
Date: Mon, 18 Jan 2021 23:39:20 -0500
Message-Id: <20210119043920.155044-15-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210119043920.155044-1-pasha.tatashin@soleen.com>
References: <20210119043920.155044-1-pasha.tatashin@soleen.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

When pages are pinned they can be faulted in userland and migrated, and
they can be faulted right in the kernel without migration.

In either case, the pinned pages must end up being pinnable (not movable).

Add a new test to gup_test, to help verify that the gup/pup
(get_user_pages() / pin_user_pages()) behavior with respect to pinnable
and movable pages is reasonable and correct. Specifically, provide a
way to:

1) Verify that only "pinnable" pages are pinned. This is checked
   automatically for you.

2) Verify that gup/pup performance is reasonable. This requires
   comparing benchmarks between doing gup/pup on pages that have been
   pre-faulted in from user space, vs. doing gup/pup on pages that are
   not faulted in until gup/pup time (via FOLL_TOUCH). This decision is
   controlled with the new -z command line option.

Signed-off-by: Pavel Tatashin
---
 mm/gup_test.c                         |  6 ++++++
 tools/testing/selftests/vm/gup_test.c | 23 +++++++++++++++++++----
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/mm/gup_test.c b/mm/gup_test.c
index a6ed1c877679..d974dec19e1c 100644
--- a/mm/gup_test.c
+++ b/mm/gup_test.c
@@ -52,6 +52,12 @@ static void verify_dma_pinned(unsigned int cmd, struct page **pages,
 
 				dump_page(page, "gup_test failure");
 				break;
+			} else if (cmd == PIN_LONGTERM_BENCHMARK &&
+				WARN(!is_pinnable_page(page),
+				     "pages[%lu] is NOT pinnable but pinned\n",
+				     i)) {
+				dump_page(page, "gup_test failure");
+				break;
 			}
 		}
 		break;
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index 943cc2608dc2..1e662d59c502 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -13,6 +13,7 @@
 
 /* Just the flags we need, copied from mm.h: */
 #define FOLL_WRITE	0x01	/* check pte is writable */
+#define FOLL_TOUCH	0x02	/* mark page accessed */
 
 static char *cmd_to_str(unsigned long cmd)
 {
@@ -39,11 +40,11 @@ int main(int argc, char **argv)
 	unsigned long size = 128 * MB;
 	int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1;
 	unsigned long cmd = GUP_FAST_BENCHMARK;
-	int flags = MAP_PRIVATE;
+	int flags = MAP_PRIVATE, touch = 0;
 	char *file = "/dev/zero";
 	char *p;
 
-	while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) {
+	while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHpz")) != -1) {
 		switch (opt) {
 		case 'a':
 			cmd = PIN_FAST_BENCHMARK;
@@ -110,6 +111,10 @@ int main(int argc, char **argv)
 		case 'H':
 			flags |= (MAP_HUGETLB | MAP_ANONYMOUS);
 			break;
+		case 'z':
+			/* fault pages in gup, do not fault in userland */
+			touch = 1;
+			break;
 		default:
 			return -1;
 		}
@@ -167,8 +172,18 @@ int main(int argc, char **argv)
 	else if (thp == 0)
 		madvise(p, size, MADV_NOHUGEPAGE);
 
-	for (; (unsigned long)p < gup.addr + size; p += PAGE_SIZE)
-		p[0] = 0;
+	/*
+	 * FOLL_TOUCH, in gup_test, is used as an either/or case: either
+	 * fault pages in from the kernel via FOLL_TOUCH, or fault them
+	 * in here, from user space. This allows comparison of performance
+	 * between those two cases.
+	 */
+	if (touch) {
+		gup.gup_flags |= FOLL_TOUCH;
+	} else {
+		for (; (unsigned long)p < gup.addr + size; p += PAGE_SIZE)
+			p[0] = 0;
+	}
 
 	/* Only report timing information on the *_BENCHMARK commands: */
 	if ((cmd == PIN_FAST_BENCHMARK) || (cmd == GUP_FAST_BENCHMARK) ||