From patchwork Tue Mar 18 16:18:15 2025
Date: Tue, 18 Mar 2025 16:18:15 +0000
In-Reply-To: <20250318161823.4005529-1-tabba@google.com>
References: <20250318161823.4005529-1-tabba@google.com>
Message-ID: <20250318161823.4005529-2-tabba@google.com>
Subject: [PATCH v7 1/9] mm: Consolidate freeing of typed folios on final folio_put()
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Some folio types, such as hugetlb, handle freeing their own folios. Moreover,
guest_memfd will require being notified once a folio's reference count
reaches 0 to facilitate shared-to-private folio conversion, without the folio
actually being freed at that point.

As a first step towards that, this patch consolidates the freeing of folios
that have a type. The first user is hugetlb folios; later in this series,
guest_memfd becomes the second user.
Suggested-by: David Hildenbrand
Acked-by: Vlastimil Babka
Acked-by: David Hildenbrand
Signed-off-by: Fuad Tabba
---
 include/linux/page-flags.h | 15 +++++++++++++++
 mm/swap.c                  | 23 ++++++++++++++++++-----
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 36d283552f80..6dc2494bd002 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -953,6 +953,21 @@ static inline bool page_has_type(const struct page *page)
 	return page_mapcount_is_type(data_race(page->page_type));
 }
 
+static inline int page_get_type(const struct page *page)
+{
+	return page->page_type >> 24;
+}
+
+static inline bool folio_has_type(const struct folio *folio)
+{
+	return page_has_type(&folio->page);
+}
+
+static inline int folio_get_type(const struct folio *folio)
+{
+	return page_get_type(&folio->page);
+}
+
 #define FOLIO_TYPE_OPS(lname, fname)					\
 static __always_inline bool folio_test_##fname(const struct folio *folio) \
 {									\
diff --git a/mm/swap.c b/mm/swap.c
index fc8281ef4241..47bc1bb919cc 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -94,6 +94,19 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+static void free_typed_folio(struct folio *folio)
+{
+	switch (folio_get_type(folio)) {
+#ifdef CONFIG_HUGETLBFS
+	case PGTY_hugetlb:
+		free_huge_folio(folio);
+		return;
+#endif
+	default:
+		WARN_ON_ONCE(1);
+	}
+}
+
 void __folio_put(struct folio *folio)
 {
 	if (unlikely(folio_is_zone_device(folio))) {
@@ -101,8 +114,8 @@ void __folio_put(struct folio *folio)
 		return;
 	}
 
-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
+	if (unlikely(folio_has_type(folio))) {
+		free_typed_folio(folio);
 		return;
 	}
 
@@ -966,13 +979,13 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;
 
-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
+		if (unlikely(folio_has_type(folio))) {
+			/* typed folios have their own memcg, if any */
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			free_huge_folio(folio);
+			free_typed_folio(folio);
 			continue;
 		}
 		folio_unqueue_deferred_split(folio);
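For readers following the page_type plumbing: the PGTY_* value lives in the
top byte of page->page_type, which is why page_get_type() above is a plain
">> 24". The following stand-alone sketch (user-space C, not from the series;
the PGTY_hugetlb value 0xf4 is quoted from the kernel's enum pagetype)
illustrates the encoding, ignoring the mapcount bits the kernel also keeps in
that word.

#include <stdio.h>

/* PGTY_hugetlb is 0xf4 in current kernels; only the value matters here. */
#define PGTY_HUGETLB 0xf4u

int main(void)
{
	/* A typed page stores its PGTY_* value in the top byte of
	 * page->page_type; the low bits hold other bookkeeping, which
	 * this sketch ignores. */
	unsigned int page_type = PGTY_HUGETLB << 24;

	/* page_get_type() above recovers the type with a plain shift. */
	printf("type = 0x%x\n", page_type >> 24);	/* prints 0xf4 */
	return 0;
}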
From patchwork Tue Mar 18 16:18:16 2025

Date: Tue, 18 Mar 2025 16:18:16 +0000
In-Reply-To: <20250318161823.4005529-1-tabba@google.com>
References: <20250318161823.4005529-1-tabba@google.com>
Message-ID: <20250318161823.4005529-3-tabba@google.com>
Subject: [PATCH v7 2/9] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Before transitioning a guest_memfd folio to unshared, thereby disallowing
access by the host and allowing the hypervisor to transition its view of the
guest page to private, we need to be sure that the host doesn't hold any
references to the folio.

This patch introduces a new type for guest_memfd folios. It isn't activated
in this series, but is a placeholder that facilitates the code in the
subsequent patch series: it will be used to register a callback that informs
the guest_memfd subsystem when the last reference is dropped, and therefore
that the host has no remaining references.

This patch also introduces the configuration option KVM_GMEM_SHARED_MEM,
which toggles support for mapping guest_memfd shared memory at the host.

Signed-off-by: Fuad Tabba
Acked-by: Vlastimil Babka
Acked-by: David Hildenbrand
---
 include/linux/kvm_host.h   |  4 ++++
 include/linux/page-flags.h | 16 ++++++++++++++++
 mm/debug.c                 |  1 +
 mm/swap.c                  | 29 +++++++++++++++++++++++++++++
 virt/kvm/Kconfig           |  4 ++++
 virt/kvm/guest_memfd.c     |  8 ++++++++
 6 files changed, 62 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f34f4cfaa513..3ad0719bfc4f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2571,4 +2571,8 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+void kvm_gmem_handle_folio_put(struct folio *folio);
+#endif
+
 #endif
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6dc2494bd002..daeee9a38e4c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -933,6 +933,7 @@ enum pagetype {
 	PGTY_slab		= 0xf5,
 	PGTY_zsmalloc		= 0xf6,
 	PGTY_unaccepted		= 0xf7,
+	PGTY_guestmem		= 0xf8,
 
 	PGTY_mapcount_underflow = 0xff
 };
@@ -1082,6 +1083,21 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
 FOLIO_TEST_FLAG_FALSE(hugetlb)
 #endif
 
+/*
+ * guestmem folios are used to back VM memory as managed by guest_memfd. Once
+ * the last reference is put, instead of freeing these folios back to the page
+ * allocator, they are returned to guest_memfd.
+ *
+ * For now, guestmem will only be set on these folios as long as they cannot be
+ * mapped to user space ("private state"), with the plan of always setting that
+ * type once typed folios can be mapped to user space cleanly.
+ */
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+FOLIO_TYPE_OPS(guestmem, guestmem)
+#else
+FOLIO_TEST_FLAG_FALSE(guestmem)
+#endif
+
 PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
 
 /*
diff --git a/mm/debug.c b/mm/debug.c
index 8d2acf432385..08bc42c6cba8 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -56,6 +56,7 @@ static const char *page_type_names[] = {
 	DEF_PAGETYPE_NAME(table),
 	DEF_PAGETYPE_NAME(buddy),
 	DEF_PAGETYPE_NAME(unaccepted),
+	DEF_PAGETYPE_NAME(guestmem),
 };
 
 static const char *page_type_name(unsigned int page_type)
diff --git a/mm/swap.c b/mm/swap.c
index 47bc1bb919cc..d8fda3948684 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -38,6 +38,10 @@
 #include
 #include
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+#include <linux/kvm_host.h>
+#endif
+
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -94,6 +98,26 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static void gmem_folio_put(struct folio *folio)
+{
+	/*
+	 * Perform the callback only as long as the KVM module is still loaded.
+	 * As long as the folio mapping is set, the folio is associated with a
+	 * guest_memfd inode.
+	 */
+	if (folio->mapping)
+		kvm_gmem_handle_folio_put(folio);
+
+	/*
+	 * If there are no references to the folio left, it's not associated
+	 * with a guest_memfd inode anymore.
+	 */
+	if (folio_ref_count(folio) == 0)
+		__folio_put(folio);
+}
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 static void free_typed_folio(struct folio *folio)
 {
 	switch (folio_get_type(folio)) {
@@ -101,6 +125,11 @@ static void free_typed_folio(struct folio *folio)
 	case PGTY_hugetlb:
 		free_huge_folio(folio);
 		return;
+#endif
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	case PGTY_guestmem:
+		gmem_folio_put(folio);
+		return;
 #endif
 	default:
 		WARN_ON_ONCE(1);
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 54e959e7d68f..4e759e8020c5 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
 	bool
 	depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_SHARED_MEM
+	select KVM_PRIVATE_MEM
+	bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b2aa6bf24d3a..5fc414becae5 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -13,6 +13,14 @@ struct kvm_gmem {
 	struct list_head entry;
 };
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+void kvm_gmem_handle_folio_put(struct folio *folio)
+{
+	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
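As context for how the new type is meant to be used: FOLIO_TYPE_OPS(guestmem,
guestmem) above generates folio_test_guestmem(), __folio_set_guestmem() and
__folio_clear_guestmem(). A hypothetical sketch of the wiring a later series
would add (this series deliberately leaves the type unset anywhere) might
look like:

/* Hypothetical kernel-context sketch, not part of this series: mark a
 * private guest_memfd folio so that its final folio_put() is routed
 * through free_typed_folio() into gmem_folio_put() above.
 */
static void kvm_gmem_mark_private(struct folio *folio)
{
	/* Per the page-flags.h comment, only folios that cannot be mapped
	 * to user space ("private state") get the type for now.
	 */
	if (!folio_test_guestmem(folio))
		__folio_set_guestmem(folio);
}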
From patchwork Tue Mar 18 16:18:20 2025
Date: Tue, 18 Mar 2025 16:18:20 +0000
In-Reply-To: <20250318161823.4005529-1-tabba@google.com>
References: <20250318161823.4005529-1-tabba@google.com>
Message-ID: <20250318161823.4005529-7-tabba@google.com>
Subject: [PATCH v7 6/9] KVM: arm64: Refactor user_mem_abort() calculation of force_pte
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

To simplify the code and make the assumptions clearer, refactor
user_mem_abort() by setting force_pte to true at its declaration, as soon as
the conditions are known. Also, remove the comment claiming that
logging_active is guaranteed to never be true for VM_PFNMAP memslots, since
that is not technically correct at present.

No functional change intended.
Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1f55b0c7b11d..887ffa1f5b14 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1460,7 +1460,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  bool fault_is_perm)
 {
 	int ret = 0;
-	bool write_fault, writable, force_pte = false;
+	bool write_fault, writable;
 	bool exec_fault, mte_allowed;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
@@ -1472,6 +1472,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	gfn_t gfn;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
+	bool force_pte = logging_active || is_protected_kvm_enabled();
 	long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
@@ -1521,16 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	/*
-	 * logging_active is guaranteed to never be true for VM_PFNMAP
-	 * memslots.
-	 */
-	if (logging_active || is_protected_kvm_enabled()) {
-		force_pte = true;
+	if (force_pte)
 		vma_shift = PAGE_SHIFT;
-	} else {
+	else
 		vma_shift = get_vma_page_shift(vma, hva);
-	}
 
 	switch (vma_shift) {
 #ifndef __PAGETABLE_PMD_FOLDED
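The equivalence the patch relies on can be checked in isolation; a generic
user-space sketch (stand-in values, not kernel code) showing that hoisting
the flag into its initializer preserves behaviour:

#include <stdbool.h>
#include <assert.h>

/* Both shapes compute the same shift; hoisting the flag to its declaration
 * just makes the assumption visible up front. */
static int shift_old(bool logging, bool protected_kvm)
{
	bool force_pte = false;

	if (logging || protected_kvm)
		force_pte = true;

	return force_pte ? 0 : 9;	/* stand-ins for the real shifts */
}

static int shift_new(bool logging, bool protected_kvm)
{
	bool force_pte = logging || protected_kvm;

	return force_pte ? 0 : 9;
}

int main(void)
{
	for (int i = 0; i < 4; i++)
		assert(shift_old(i & 1, i & 2) == shift_new(i & 1, i & 2));
	return 0;
}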
From patchwork Tue Mar 18 16:18:21 2025

Date: Tue, 18 Mar 2025 16:18:21 +0000
In-Reply-To: <20250318161823.4005529-1-tabba@google.com>
References: <20250318161823.4005529-1-tabba@google.com>
Message-ID: <20250318161823.4005529-8-tabba@google.com>
Subject: [PATCH v7 7/9] KVM: arm64: Handle guest_memfd()-backed guest page faults
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Add arm64 support for handling guest page faults on guest_memfd-backed
memslots. For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c     | 65 +++++++++++++++++++++++++++-------------
 include/linux/kvm_host.h |  5 ++++
 virt/kvm/kvm_main.c      |  5 ----
 3 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 887ffa1f5b14..adb0681fc1c6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1454,6 +1454,30 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static kvm_pfn_t faultin_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			     gfn_t gfn, bool write_fault, bool *writable,
+			     struct page **page, bool is_private)
+{
+	kvm_pfn_t pfn;
+	int ret;
+
+	if (!is_private)
+		return __kvm_faultin_pfn(slot, gfn, write_fault ? FOLL_WRITE : 0, writable, page);
+
+	*writable = false;
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, page, NULL);
+	if (!ret) {
+		*writable = !memslot_is_readonly(slot);
+		return pfn;
+	}
+
+	if (ret == -EHWPOISON)
+		return KVM_PFN_ERR_HWPOISON;
+
+	return KVM_PFN_ERR_NOSLOT_MASK;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1461,19 +1485,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 {
 	int ret = 0;
 	bool write_fault, writable;
-	bool exec_fault, mte_allowed;
+	bool exec_fault, mte_allowed = false;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
 	phys_addr_t ipa = fault_ipa;
 	struct kvm *kvm = vcpu->kvm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	short vma_shift;
 	void *memcache;
-	gfn_t gfn;
+	gfn_t gfn = ipa >> PAGE_SHIFT;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active || is_protected_kvm_enabled();
-	long vma_pagesize, fault_granule;
+	bool is_gmem = kvm_mem_is_private(kvm, gfn);
+	bool force_pte = logging_active || is_gmem || is_protected_kvm_enabled();
+	long vma_pagesize, fault_granule = PAGE_SIZE;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
 	struct page *page;
@@ -1510,16 +1535,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return ret;
 	}
 
+	mmap_read_lock(current->mm);
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
-	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
-	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-		mmap_read_unlock(current->mm);
-		return -EFAULT;
+	if (!is_gmem) {
+		vma = vma_lookup(current->mm, hva);
+		if (unlikely(!vma)) {
+			kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+			mmap_read_unlock(current->mm);
+			return -EFAULT;
+		}
+
+		vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+		mte_allowed = kvm_vma_mte_allowed(vma);
 	}
 
 	if (force_pte)
@@ -1590,18 +1621,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ipa &= ~(vma_pagesize - 1);
 	}
 
-	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
-
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
-
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
-	 * vma_lookup() or __kvm_faultin_pfn() become stale prior to
-	 * acquiring kvm->mmu_lock.
+	 * vma_lookup() or faultin_pfn() become stale prior to acquiring
+	 * kvm->mmu_lock.
 	 *
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
@@ -1609,8 +1635,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
+	pfn = faultin_pfn(kvm, memslot, gfn, write_fault, &writable, &page, is_gmem);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3d5595a71a2a..ec3bedc18eab 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1882,6 +1882,11 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn)
 	return gfn_to_memslot(kvm, gfn)->id;
 }
 
+static inline bool memslot_is_readonly(const struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
 static inline gfn_t
 hva_to_gfn_memslot(unsigned long hva, struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38f0f402ea46..3e40acb9f5c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2624,11 +2624,6 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return size;
 }
 
-static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
-{
-	return slot->flags & KVM_MEM_READONLY;
-}
-
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
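The subtle part of faultin_pfn() above is the writable computation: for
guest_memfd-backed ("private") faults there is no host VMA to consult, so
writability is derived solely from the memslot's read-only flag. A
stand-alone model of just that decision (assuming KVM_MEM_READONLY's uapi
value of (1UL << 1) from <linux/kvm.h>):

#include <stdbool.h>
#include <stdio.h>

#define KVM_MEM_READONLY (1UL << 1)	/* uapi value, quoted for the model */

struct slot { unsigned long flags; };

static bool memslot_is_readonly(const struct slot *s)
{
	return s->flags & KVM_MEM_READONLY;
}

int main(void)
{
	struct slot s = { .flags = 0 };

	/* gmem faults never consult host mapping permissions; only the
	 * slot's read-only flag decides, as in faultin_pfn() above. */
	printf("writable = %d\n", !memslot_is_readonly(&s));
	return 0;
}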
From patchwork Tue Mar 18 16:18:23 2025

Date: Tue, 18 Mar 2025 16:18:23 +0000
In-Reply-To: <20250318161823.4005529-1-tabba@google.com>
References: <20250318161823.4005529-1-tabba@google.com>
Message-ID: <20250318161823.4005529-10-tabba@google.com>
Subject: [PATCH v7 9/9] KVM: guest_memfd: selftests: guest_memfd mmap()
 test when mapping is allowed
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Expand the guest_memfd selftests to cover mapping guest memory for VM types
that support it. Also, build the guest_memfd selftest for arm64.

Signed-off-by: Fuad Tabba
---
 tools/testing/selftests/kvm/Makefile.kvm     |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c | 75 +++++++++++++++++--
 2 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 4277b983cace..c9a3f30e28dd 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -160,6 +160,7 @@ TEST_GEN_PROGS_arm64 += coalesced_io_test
 TEST_GEN_PROGS_arm64 += demand_paging_test
 TEST_GEN_PROGS_arm64 += dirty_log_test
 TEST_GEN_PROGS_arm64 += dirty_log_perf_test
+TEST_GEN_PROGS_arm64 += guest_memfd_test
 TEST_GEN_PROGS_arm64 += guest_print_test
 TEST_GEN_PROGS_arm64 += get-reg-list
 TEST_GEN_PROGS_arm64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ce687f8d248f..38c501e49e0e 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,48 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	const char val = 0xaa;
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass.");
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -170,19 +206,27 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-int main(int argc, char *argv[])
+unsigned long get_shared_type(void)
 {
-	size_t page_size;
+#ifdef __x86_64__
+	return KVM_X86_SW_PROTECTED_VM;
+#endif
+	return 0;
+}
+
+void test_vm_type(unsigned long type, bool is_shared)
+{
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
-	vm = vm_create_barebones();
+	vm = vm_create_barebones_type(type);
 
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
@@ -190,10 +234,29 @@ int main(int argc, char *argv[])
 	fd = vm_create_guest_memfd(vm, total_size, 0);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+
+	if (is_shared)
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
+
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
 	close(fd);
+	kvm_vm_release(vm);
+}
+
+int main(int argc, char *argv[])
+{
+#ifndef __aarch64__
+	/* For now, arm64 only supports shared guest memory. */
+	test_vm_type(VM_TYPE_DEFAULT, false);
+#endif
+
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		test_vm_type(get_shared_type(), true);
+
+	return 0;
 }
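As a usage note, the reworked test drives everything through test_vm_type().
A minimal caller against the same selftest framework (a sketch assuming the
framework's kvm_util.h helpers used in the diff above; error handling
elided) looks like:

/* Minimal sketch: create a VM of the given type, back four pages with
 * guest_memfd, and mmap() them -- the path test_mmap_allowed() exercises. */
#include "kvm_util.h"		/* kvm selftest framework helpers */
#include <sys/mman.h>
#include <unistd.h>

static char *map_gmem_pages(unsigned long vm_type)
{
	struct kvm_vm *vm = vm_create_barebones_type(vm_type);
	size_t size = getpagesize() * 4;
	int fd = vm_create_guest_memfd(vm, size, 0);
	char *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	return mem == MAP_FAILED ? NULL : mem;
}

The test itself builds with the usual kselftest flow, e.g.
make -C tools/testing/selftests TARGETS=kvm.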