From patchwork Wed Dec 7 01:49:21 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 632719
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton,
    Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
    Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar,
    Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman,
    marcelo.cerri@canonical.com, tim.gardner@canonical.com,
    khalid.elmously@canonical.com, philip.cox@canonical.com,
    aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org,
    linux-coco@lists.linux.dev, linux-efi@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCHv8 02/14] mm: Add support for unaccepted memory
Date: Wed, 7 Dec 2022 04:49:21 +0300
Message-Id: <20221207014933.8435-3-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com>

UEFI Specification version 2.9 introduces the concept of memory acceptance.
Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require
memory to be accepted before it can be used by the guest. Acceptance happens
via a protocol specific to the Virtual Machine platform.

There are several ways the kernel can deal with unaccepted memory:

 1. Accept all the memory during boot. It is easy to implement and it has no
    runtime cost once the system is booted. The downside is a very long boot
    time. Acceptance can be parallelized across multiple CPUs to keep it
    manageable (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
    memory bandwidth and does not scale beyond that point.

 2. Accept a block of memory on first use. It requires more infrastructure
    and changes in the page allocator to make it work, but it provides good
    boot time. On-demand memory acceptance means latency spikes every time
    the kernel steps onto a new memory block. The spikes will go away once
    the workload's data set size stabilizes or all memory gets accepted.

 3. Accept all memory in the background. Introduce a thread (or several)
    that accepts memory proactively. It minimizes the time the system
    experiences latency spikes on memory allocation while keeping boot time
    low. This approach cannot function on its own: it is an extension of #2,
    since background memory acceptance requires a functional scheduler, but
    the page allocator may need to tap into unaccepted memory before that.
    The downside of the approach is that these threads also steal CPU cycles
    and memory bandwidth from the user's workload and may hurt the user
    experience.

Implement #2 for now. It is a reasonable default. Approaches #1 and #3 can
be implemented later based on user demand.

Support of unaccepted memory requires a few changes in core-mm code:

 - memblock has to accept memory on allocation;

 - the page allocator has to accept memory on the first allocation of
   the page.

The memblock change is trivial. The page allocator is modified to accept
pages lazily: new memory gets accepted before putting pages on the free
lists, but only when the already-accepted memory runs out.

An architecture has to provide two helpers if it wants to support
unaccepted memory (a sketch of the expected contract follows below):

 - accept_memory() makes a range of physical addresses accepted.

 - range_contains_unaccepted_memory() checks whether anything within the
   range of physical addresses requires acceptance.
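For illustration, a minimal sketch of the contract an architecture has to
implement, assuming a hypothetical backend where one bit in a toy bitmap
marks one still-unaccepted 2M chunk (the names toy_unaccepted and
TOY_MAX_CHUNKS are made up for this sketch; the real x86 backend arrives
later in the series):

	/* One bit per 2M chunk; a set bit means the chunk is still unaccepted */
	static DECLARE_BITMAP(toy_unaccepted, TOY_MAX_CHUNKS);

	bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
	{
		unsigned long chunk;

		for (chunk = start / SZ_2M; chunk * SZ_2M < end; chunk++) {
			if (test_bit(chunk, toy_unaccepted))
				return true;
		}
		return false;
	}

	void accept_memory(phys_addr_t start, phys_addr_t end)
	{
		unsigned long chunk;

		for (chunk = start / SZ_2M; chunk * SZ_2M < end; chunk++) {
			if (!test_and_clear_bit(chunk, toy_unaccepted))
				continue;
			/* platform-specific acceptance call for this chunk goes here */
		}
	}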
Signed-off-by: Kirill A. Shutemov
Acked-by: Mike Rapoport	# memblock
---
 include/linux/mmzone.h     |   5 ++
 include/linux/page-flags.h |  24 ++++++++
 mm/internal.h              |  12 ++++
 mm/memblock.c              |   9 +++
 mm/page_alloc.c            | 119 +++++++++++++++++++++++++++++++++++++
 5 files changed, 169 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5f74891556f3..da335381e63f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -822,6 +822,11 @@ struct zone {
 	/* free areas of different sizes */
 	struct free_area	free_area[MAX_ORDER];
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	/* pages to be accepted */
+	struct list_head	unaccepted_pages;
+#endif
+
 	/* zone flags, see below */
 	unsigned long		flags;

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 0b0ae5084e60..ce953be8fe10 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -941,6 +941,7 @@ static inline bool is_page_hwpoison(struct page *page)
 #define PG_offline	0x00000100
 #define PG_table	0x00000200
 #define PG_guard	0x00000400
+#define PG_unaccepted	0x00000800
 
 #define PageType(page, flag)						\
 	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
@@ -966,6 +967,18 @@ static __always_inline void __ClearPage##uname(struct page *page)	\
 	page->page_type |= PG_##lname;					\
 }
 
+#define PAGE_TYPE_OPS_FALSE(uname)					\
+static __always_inline int Page##uname(struct page *page)		\
+{									\
+	return false;							\
+}									\
+static __always_inline void __SetPage##uname(struct page *page)	\
+{									\
+}									\
+static __always_inline void __ClearPage##uname(struct page *page)	\
+{									\
+}
+
 /*
  * PageBuddy() indicates that the page is free and in the buddy system
  * (see mm/page_alloc.c).
@@ -996,6 +1009,17 @@ PAGE_TYPE_OPS(Buddy, buddy)
  */
 PAGE_TYPE_OPS(Offline, offline)
 
+/*
+ * PageUnaccepted() indicates that the page has to be "accepted" before it can
+ * be read or written. The page allocator must call accept_page() before
+ * touching the page or returning it to the caller.
+ */
+#ifdef CONFIG_UNACCEPTED_MEMORY
+PAGE_TYPE_OPS(Unaccepted, unaccepted)
+#else
+PAGE_TYPE_OPS_FALSE(Unaccepted)
+#endif
+
 extern void page_offline_freeze(void);
 extern void page_offline_thaw(void);
 extern void page_offline_begin(void);

diff --git a/mm/internal.h b/mm/internal.h
index 6b7ef495b56d..8ef4f88608ad 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -856,4 +856,16 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)
 	return !(vma->vm_flags & VM_SOFTDIRTY);
 }
 
+#ifndef CONFIG_UNACCEPTED_MEMORY
+static inline bool range_contains_unaccepted_memory(phys_addr_t start,
+						    phys_addr_t end)
+{
+	return false;
+}
+
+static inline void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+}
+#endif
+
 #endif	/* __MM_INTERNAL_H */

diff --git a/mm/memblock.c b/mm/memblock.c
index 511d4783dcf1..3bc404a5352a 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1423,6 +1423,15 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 	 */
 	kmemleak_alloc_phys(found, size, 0);
 
+	/*
+	 * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
+	 * require memory to be accepted before it can be used by the
+	 * guest.
+	 *
+	 * Accept the memory of the allocated buffer.
+	 */
+	accept_memory(found, found + size);
+
 	return found;
 }

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e60657875d3..6d597e833a73 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -450,6 +450,11 @@ EXPORT_SYMBOL(nr_online_nodes);
 
 int page_group_by_mobility_disabled __read_mostly;
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+/* Counts number of zones with unaccepted pages. */
+static DEFINE_STATIC_KEY_FALSE(unaccepted_pages);
+#endif
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 /*
  * During boot we initialize deferred pages on-demand, as needed, but once
@@ -1043,12 +1048,15 @@ static inline void move_to_free_list(struct page *page, struct zone *zone,
 {
 	struct free_area *area = &zone->free_area[order];
 
+	VM_BUG_ON_PAGE(PageUnaccepted(page), page);
 	list_move_tail(&page->buddy_list, &area->free_list[migratetype]);
 }
 
 static inline void del_page_from_free_list(struct page *page, struct zone *zone,
 					   unsigned int order)
 {
+	VM_BUG_ON_PAGE(PageUnaccepted(page), page);
+
 	/* clear reported state and update reported page count */
 	if (page_reported(page))
 		__ClearPageReported(page);
@@ -1728,6 +1736,97 @@ static void __free_pages_ok(struct page *page, unsigned int order,
 	__count_vm_events(PGFREE, 1 << order);
 }
 
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+	phys_addr_t end = start + (PAGE_SIZE << order);
+
+	return range_contains_unaccepted_memory(start, end);
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+
+	accept_memory(start, start + (PAGE_SIZE << order));
+}
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+static bool try_to_accept_memory(struct zone *zone)
+{
+	unsigned long flags, order;
+	struct page *page;
+	bool last = false;
+	int migratetype;
+
+	if (!static_branch_unlikely(&unaccepted_pages))
+		return false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	page = list_first_entry_or_null(&zone->unaccepted_pages,
+					struct page, lru);
+	if (!page) {
+		spin_unlock_irqrestore(&zone->lock, flags);
+		return false;
+	}
+
+	list_del(&page->lru);
+	last = list_empty(&zone->unaccepted_pages);
+
+	order = page->private;
+	VM_BUG_ON(order > MAX_ORDER || order < pageblock_order);
+
+	migratetype = get_pfnblock_migratetype(page, page_to_pfn(page));
+	__mod_zone_freepage_state(zone, -1 << order, migratetype);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (last)
+		static_branch_dec(&unaccepted_pages);
+
+	accept_page(page, order);
+	__ClearPageUnaccepted(page);
+	__free_pages_ok(page, order, FPI_TO_TAIL | FPI_SKIP_KASAN_POISON);
+
+	return true;
+}
+
+static void __free_unaccepted(struct page *page, unsigned int order)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long flags;
+	int migratetype;
+	bool first = false;
+
+	VM_BUG_ON(order > MAX_ORDER || order < pageblock_order);
+	__SetPageUnaccepted(page);
+	page->private = order;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	first = list_empty(&zone->unaccepted_pages);
+	migratetype = get_pfnblock_migratetype(page, page_to_pfn(page));
+	list_add_tail(&page->lru, &zone->unaccepted_pages);
+	__mod_zone_freepage_state(zone, 1 << order, migratetype);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (first)
+		static_branch_inc(&unaccepted_pages);
+}
+
+#else
+
+static bool try_to_accept_memory(struct zone *zone)
+{
+	return false;
+}
+
+static void __free_unaccepted(struct page *page, unsigned int order)
+{
+	BUILD_BUG();
+}
+
+#endif /* CONFIG_UNACCEPTED_MEMORY */
+
 void __free_pages_core(struct page *page, unsigned int order)
 {
 	unsigned int nr_pages = 1 << order;
@@ -1750,6 +1849,13 @@ void __free_pages_core(struct page *page, unsigned int order)
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
 
+	if (page_contains_unaccepted(page, order)) {
+		if (order >= pageblock_order)
+			return __free_unaccepted(page, order);
+		else
+			accept_page(page, order);
+	}
+
 	/*
 	 * Bypass PCP and place fresh pages right to the tail, primarily
 	 * relevant for memory onlining.
 	 */
@@ -1910,6 +2016,9 @@ static void __init deferred_free_range(unsigned long pfn,
 		return;
 	}
 
+	/* Accept chunks smaller than page-block upfront */
+	accept_memory(PFN_PHYS(pfn), PFN_PHYS(pfn + nr_pages));
+
 	for (i = 0; i < nr_pages; i++, page++, pfn++) {
 		if (pageblock_aligned(pfn))
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
@@ -4247,6 +4356,9 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				       gfp_mask)) {
 			int ret;
 
+			if (try_to_accept_memory(zone))
+				goto try_this_zone;
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/*
 			 * Watermark failed for this zone, but see if we can
@@ -4299,6 +4411,9 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 
 			return page;
 		} else {
+			if (try_to_accept_memory(zone))
+				goto try_this_zone;
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/* Try again if zone has deferred pages */
 			if (static_branch_unlikely(&deferred_pages)) {
@@ -6935,6 +7050,10 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 		INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
 		zone->free_area[order].nr_free = 0;
 	}
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	INIT_LIST_HEAD(&zone->unaccepted_pages);
+#endif
 }

From patchwork Wed Dec 7 01:49:22 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 632720
From: "Kirill A. Shutemov"
Subject: [PATCHv8 03/14] mm: Report unaccepted memory in meminfo
Date: Wed, 7 Dec 2022 04:49:22 +0300
Message-Id: <20221207014933.8435-4-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com>

Track the amount of unaccepted memory and report it in /proc/meminfo and
in the node meminfo.

Signed-off-by: Kirill A. Shutemov
---
 drivers/base/node.c    | 7 +++++++
 fs/proc/meminfo.c      | 5 +++++
 include/linux/mmzone.h | 3 +++
 mm/page_alloc.c        | 2 ++
 mm/vmstat.c            | 1 +
 5 files changed, 18 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index faf3597a96da..ca6f0590be21 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -448,6 +448,9 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     "Node %d ShmemPmdMapped: %8lu kB\n"
 			     "Node %d FileHugePages: %8lu kB\n"
 			     "Node %d FilePmdMapped: %8lu kB\n"
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     "Node %d Unaccepted:     %8lu kB\n"
 #endif
 			     ,
 			     nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
@@ -477,6 +480,10 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)),
 			     nid, K(node_page_state(pgdat, NR_FILE_THPS)),
 			     nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED))
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     ,
+			     nid, K(node_page_state(pgdat, NR_UNACCEPTED))
 #endif
 			     );
 	len += hugetlb_report_node_meminfo(buf, len, nid);

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 440960110a42..789b77c7b6df 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -155,6 +155,11 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		    global_zone_page_state(NR_FREE_CMA_PAGES));
 #endif
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	show_val_kb(m, "Unaccepted:     ",
+		    global_node_page_state(NR_UNACCEPTED));
+#endif
+
 	hugetlb_report_meminfo(m);
 
 	arch_report_meminfo(m);

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index da335381e63f..9c762e8175fc 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -198,6 +198,9 @@ enum node_stat_item {
 	NR_FOLL_PIN_ACQUIRED,	/* via: pin_user_page(), gup flag: FOLL_PIN */
 	NR_FOLL_PIN_RELEASED,	/* pages returned via unpin_user_page() */
 	NR_KERNEL_STACK_KB,	/* measured in KiB */
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	NR_UNACCEPTED,
+#endif
 #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
 	NR_KERNEL_SCS_KB,	/* measured in KiB */
 #endif

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d597e833a73..e80e8d398863 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1779,6 +1779,7 @@ static bool try_to_accept_memory(struct zone *zone)
 
 	migratetype = get_pfnblock_migratetype(page, page_to_pfn(page));
 	__mod_zone_freepage_state(zone, -1 << order, migratetype);
+	__mod_node_page_state(page_pgdat(page), NR_UNACCEPTED, -1 << order);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	if (last)
@@ -1807,6 +1808,7 @@ static void __free_unaccepted(struct page *page, unsigned int order)
 	migratetype = get_pfnblock_migratetype(page, page_to_pfn(page));
 	list_add_tail(&page->lru, &zone->unaccepted_pages);
 	__mod_zone_freepage_state(zone, 1 << order, migratetype);
+	__mod_node_page_state(page_pgdat(page), NR_UNACCEPTED, 1 << order);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	if (first)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index b2371d745e00..fb15213be374 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1230,6 +1230,7 @@ const char * const vmstat_text[] = {
 	"nr_foll_pin_acquired",
 	"nr_foll_pin_released",
 	"nr_kernel_stack",
+	"nr_unaccepted",
#if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
 	"nr_shadow_call_stack",
 #endif
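For illustration, with this patch applied the new fields would read roughly
as follows on a guest with about 8GiB still unaccepted (values are made up):

	$ grep Unaccepted /proc/meminfo
	Unaccepted:      8388608 kB

	$ grep Unaccepted /sys/devices/system/node/node0/meminfo
	Node 0 Unaccepted:      8388608 kB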
Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 06/14] efi/x86: Implement support for unaccepted memory Date: Wed, 7 Dec 2022 04:49:25 +0300 Message-Id: <20221207014933.8435-7-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> References: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org UEFI Specification version 2.9 introduces the concept of memory acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, requiring memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific for the Virtual Machine platform. Accepting memory is costly and it makes VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 has to be modified on every page acceptance. It leads to table fragmentation, but there's a limited number of entries in the e820 table Another option is to mark such memory as usable in e820 and track if the range has been accepted in a bitmap. One bit in the bitmap represents 2MiB in the address space: one 4k page is enough to track 64GiB or physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- It needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to 2M gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via boot_params. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the maximum address in the memory map. The same boot_params.unaccepted_memory can be used to pass the bitmap between two kernels on kexec, but the use-case is not yet implemented. The implementation requires some basic helpers in boot stub. They provided by linux/ includes in the main kernel image, but is not present in boot stub. Create copy of required functionality in the boot stub. Signed-off-by: Kirill A. 
Signed-off-by: Kirill A. Shutemov
---
 Documentation/x86/zero-page.rst          |  1 +
 arch/x86/boot/compressed/Makefile        |  1 +
 arch/x86/boot/compressed/mem.c           | 73 ++++++++++++++++++++++++
 arch/x86/include/asm/unaccepted_memory.h | 10 ++++
 arch/x86/include/uapi/asm/bootparam.h    |  2 +-
 drivers/firmware/efi/Kconfig             | 14 +++++
 drivers/firmware/efi/efi.c               |  1 +
 drivers/firmware/efi/libstub/x86-stub.c  | 68 ++++++++++++++++++++++
 include/linux/efi.h                      |  3 +-
 9 files changed, 171 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/boot/compressed/mem.c
 create mode 100644 arch/x86/include/asm/unaccepted_memory.h

diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst
index 45aa9cceb4f1..f21905e61ade 100644
--- a/Documentation/x86/zero-page.rst
+++ b/Documentation/x86/zero-page.rst
@@ -20,6 +20,7 @@ Offset/Size	Proto	Name			Meaning
 060/010		ALL	ist_info	Intel SpeedStep (IST) BIOS support information (struct ist_info)
 070/008		ALL	acpi_rsdp_addr	Physical address of ACPI RSDP table
+078/008		ALL	unaccepted_memory	Bitmap of unaccepted memory (1bit == 2M)
 080/010		ALL	hd0_info	hd0 disk parameter, OBSOLETE!!
 090/010		ALL	hd1_info	hd1 disk parameter, OBSOLETE!!
 0A0/010		ALL	sys_desc_table	System description table (struct sys_desc_table),

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 3dc5db651dd0..0ae221540dee 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -107,6 +107,7 @@ endif
 
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
 vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
+vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/mem.o
 
 vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o

diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
new file mode 100644
index 000000000000..a848119e4455
--- /dev/null
+++ b/arch/x86/boot/compressed/mem.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "../cpuflags.h"
+#include "bitmap.h"
+#include "error.h"
+#include "math.h"
+
+#define PMD_SHIFT	21
+#define PMD_SIZE	(_AC(1, UL) << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE - 1))
+
+static inline void __accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	/* Platform-specific memory-acceptance call goes here */
+	error("Cannot accept memory");
+}
+
+/*
+ * The accepted memory bitmap only works at PMD_SIZE granularity.  This
+ * function takes unaligned start/end addresses and either:
+ *  1. Accepts the memory immediately and in its entirety
+ *  2. Accepts unaligned parts, and marks *some* aligned part unaccepted
+ *
+ * The function will never reach the bitmap_set() with zero bits to set.
+ */
+void process_unaccepted_memory(struct boot_params *params, u64 start, u64 end)
+{
+	/*
+	 * Ensure that at least one bit will be set in the bitmap by
+	 * immediately accepting all regions under 2*PMD_SIZE.  This is
+	 * imprecise and may immediately accept some areas that could
+	 * have been represented in the bitmap.  But, results in simpler
+	 * code below
+	 *
+	 * Consider case like this:
+	 *
+	 *                      | 4k | 2044k |    2048k   |
+	 *                      ^ 0x0        ^ 2MB        ^ 4MB
+	 *
+	 * Only the first 4k has been accepted. The 0MB->2MB region can not be
+	 * represented in the bitmap. The 2MB->4MB region can be represented in
+	 * the bitmap. But, the 0MB->4MB region is <2*PMD_SIZE and will be
+	 * immediately accepted in its entirety.
+	 */
+	if (end - start < 2 * PMD_SIZE) {
+		__accept_memory(start, end);
+		return;
+	}
+
+	/*
+	 * No matter how the start and end are aligned, at least one unaccepted
+	 * PMD_SIZE area will remain to be marked in the bitmap.
+	 */
+
+	/* Immediately accept a <PMD_SIZE piece at the start: */
+	if (start & ~PMD_MASK) {
+		__accept_memory(start, round_up(start, PMD_SIZE));
+		start = round_up(start, PMD_SIZE);
+	}
+
+	/* Immediately accept a <PMD_SIZE piece at the end: */
+	if (end & ~PMD_MASK) {
+		__accept_memory(round_down(end, PMD_SIZE), end);
+		end = round_down(end, PMD_SIZE);
+	}
+
+	/*
+	 * 'start' and 'end' are now both PMD-aligned.
+	 * Record the range as being unaccepted:
+	 */
+	bitmap_set((unsigned long *)params->unaccepted_memory,
+		   start / PMD_SIZE, (end - start) / PMD_SIZE);
+}

diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
new file mode 100644
index 000000000000..df0736d32858
--- /dev/null
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2020 Intel Corporation */
+#ifndef _ASM_X86_UNACCEPTED_MEMORY_H
+#define _ASM_X86_UNACCEPTED_MEMORY_H
+
+struct boot_params;
+
+void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num);
+
+#endif

diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 01d19fc22346..630a54046af0 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -189,7 +189,7 @@ struct boot_params {
 	__u64 tboot_addr;				/* 0x058 */
 	struct ist_info ist_info;			/* 0x060 */
 	__u64 acpi_rsdp_addr;				/* 0x070 */
-	__u8  _pad3[8];					/* 0x078 */
+	__u64 unaccepted_memory;			/* 0x078 */
 	__u8  hd0_info[16];	/* obsolete! */		/* 0x080 */
 	__u8  hd1_info[16];	/* obsolete! */		/* 0x090 */
 	struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */

diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 6787ed8dfacf..8aa8adf0bcb5 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -314,6 +314,20 @@ config EFI_COCO_SECRET
 	  virt/coco/efi_secret module to access the secrets, which in turn
 	  allows userspace programs to access the injected secrets.
 
+config UNACCEPTED_MEMORY
+	bool
+	depends on EFI_STUB
+	help
+	  Some Virtual Machine platforms, such as Intel TDX, require
+	  some memory to be "accepted" by the guest before it can be used.
+	  This mechanism helps prevent malicious hosts from making changes
+	  to guest memory.
+
+	  UEFI specification v2.9 introduced EFI_UNACCEPTED_MEMORY memory type.
+
+	  This option adds support for unaccepted memory and makes such memory
+	  usable by the kernel.
+
 config EFI_EMBEDDED_FIRMWARE
 	bool
 	select CRYPTO_LIB_SHA256

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index a46df5d1d094..f525144e22e4 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -777,6 +777,7 @@ static __initdata char memory_type_name[][13] = {
 	"MMIO Port",
 	"PAL Code",
 	"Persistent",
+	"Unaccepted",
 };
 
 char * __init efi_md_typeattr_format(char *buf, size_t size,

diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index fff81843169c..27b9eed5883b 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -15,6 +15,7 @@
 #include <asm/setup.h>
 #include <asm/desc.h>
 #include <asm/boot.h>
+#include <asm/unaccepted_memory.h>
 
 #include "efistub.h"
@@ -613,6 +614,16 @@ setup_e820(struct boot_params *params, struct setup_data *e820ext, u32 e820ext_size)
 			e820_type = E820_TYPE_PMEM;
 			break;
 
+		case EFI_UNACCEPTED_MEMORY:
+			if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) {
+				efi_warn_once(
+"The system has unaccepted memory, but kernel does not support it\nConsider enabling CONFIG_UNACCEPTED_MEMORY\n");
+				continue;
+			}
+			e820_type = E820_TYPE_RAM;
+			process_unaccepted_memory(params, d->phys_addr,
+						  d->phys_addr + PAGE_SIZE * d->num_pages);
+			break;
 		default:
 			continue;
 		}
@@ -677,6 +688,60 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext,
 	return status;
 }
 
+static efi_status_t allocate_unaccepted_bitmap(struct boot_params *params,
+					       __u32 nr_desc,
+					       struct efi_boot_memmap *map)
+{
+	unsigned long *mem = NULL;
+	u64 size, max_addr = 0;
+	efi_status_t status;
+	bool found = false;
+	int i;
+
+	/* Check if there's any unaccepted memory and find the max address */
+	for (i = 0; i < nr_desc; i++) {
+		efi_memory_desc_t *d;
+		unsigned long m = (unsigned long)map->map;
+
+		d = efi_early_memdesc_ptr(m, map->desc_size, i);
+		if (d->type == EFI_UNACCEPTED_MEMORY)
+			found = true;
+		if (d->phys_addr + d->num_pages * PAGE_SIZE > max_addr)
+			max_addr = d->phys_addr + d->num_pages * PAGE_SIZE;
+	}
+
+	if (!found) {
+		params->unaccepted_memory = 0;
+		return EFI_SUCCESS;
+	}
+
+	/*
+	 * If unaccepted memory is present, allocate a bitmap to track what
+	 * memory has to be accepted before access.
+	 *
+	 * One bit in the bitmap represents 2MiB in the address space:
+	 * A 4k bitmap can track 64GiB of physical address space.
+	 *
+	 * In the worst case scenario -- a huge hole in the middle of the
+	 * address space -- it needs 256MiB to handle 4PiB of the address
+	 * space.
+	 *
+	 * TODO: handle situation if params->unaccepted_memory is already set.
+	 * It's required to deal with kexec.
+	 *
+	 * The bitmap will be populated in setup_e820() according to the memory
+	 * map after efi_exit_boot_services().
+	 */
+	size = DIV_ROUND_UP(max_addr, PMD_SIZE * BITS_PER_BYTE);
+	status = efi_allocate_pages(size, (unsigned long *)&mem, ULONG_MAX);
+	if (status == EFI_SUCCESS) {
+		memset(mem, 0, size);
+		params->unaccepted_memory = (unsigned long)mem;
+	}
+
+	return status;
+}
+
 static efi_status_t allocate_e820(struct boot_params *params,
 				  struct setup_data **e820ext,
 				  u32 *e820ext_size)
@@ -697,6 +762,9 @@ static efi_status_t allocate_e820(struct boot_params *params,
 		status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
 	}
 
+	if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) && status == EFI_SUCCESS)
+		status = allocate_unaccepted_bitmap(params, nr_desc, map);
+
 	efi_bs_call(free_pool, map);
 
 	return status;
 }

diff --git a/include/linux/efi.h b/include/linux/efi.h
index 7603fc58c47c..cfdcc165071e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -108,7 +108,8 @@ typedef struct {
 #define EFI_MEMORY_MAPPED_IO_PORT_SPACE	12
 #define EFI_PAL_CODE			13
 #define EFI_PERSISTENT_MEMORY		14
-#define EFI_MAX_MEMORY_TYPE		15
+#define EFI_UNACCEPTED_MEMORY		15
+#define EFI_MAX_MEMORY_TYPE		16
 
 /* Attribute values: */
 #define EFI_MEMORY_UC	((u64)0x0000000000000001ULL)	/* uncached */
Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 09/14] x86/mm: Provide helpers for unaccepted memory Date: Wed, 7 Dec 2022 04:49:28 +0300 Message-Id: <20221207014933.8435-10-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> References: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Core-mm requires few helpers to support unaccepted memory: - accept_memory() checks the range of addresses against the bitmap and accept memory if needed. - range_contains_unaccepted_memory() checks if anything within the range requires acceptance. Signed-off-by: Kirill A. Shutemov --- arch/x86/include/asm/page.h | 3 ++ arch/x86/include/asm/unaccepted_memory.h | 4 ++ arch/x86/mm/Makefile | 2 + arch/x86/mm/unaccepted_memory.c | 61 ++++++++++++++++++++++++ 4 files changed, 70 insertions(+) create mode 100644 arch/x86/mm/unaccepted_memory.c diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h index 9cc82f305f4b..df4ec3a988dc 100644 --- a/arch/x86/include/asm/page.h +++ b/arch/x86/include/asm/page.h @@ -19,6 +19,9 @@ struct page; #include + +#include + extern struct range pfn_mapped[]; extern int nr_pfn_mapped; diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h index 41fbfc798100..89fc91c61560 100644 --- a/arch/x86/include/asm/unaccepted_memory.h +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -7,6 +7,10 @@ struct boot_params; void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num); +#ifdef CONFIG_UNACCEPTED_MEMORY + void accept_memory(phys_addr_t start, phys_addr_t end); +bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end); #endif +#endif diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index c80febc44cd2..b0ef1755e5c8 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -67,3 +67,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + +obj-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c new file mode 100644 index 000000000000..1df918b21469 --- /dev/null +++ b/arch/x86/mm/unaccepted_memory.c @@ -0,0 +1,61 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include + +#include +#include +#include + +/* Protects unaccepted memory bitmap */ +static DEFINE_SPINLOCK(unaccepted_memory_lock); + +void accept_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long range_start, range_end; + unsigned long *bitmap; + unsigned long flags; + + if (!boot_params.unaccepted_memory) + return; + + bitmap = __va(boot_params.unaccepted_memory); + range_start = start / PMD_SIZE; + + 
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	for_each_set_bitrange_from(range_start, range_end, bitmap,
+				   DIV_ROUND_UP(end, PMD_SIZE)) {
+		unsigned long len = range_end - range_start;
+
+		/* Platform-specific memory-acceptance call goes here */
+		panic("Cannot accept memory: unknown platform\n");
+		bitmap_clear(bitmap, range_start, len);
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+}
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
+{
+	unsigned long *bitmap;
+	unsigned long flags;
+	bool ret = false;
+
+	if (!boot_params.unaccepted_memory)
+		return 0;
+
+	bitmap = __va(boot_params.unaccepted_memory);
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	while (start < end) {
+		if (test_bit(start / PMD_SIZE, bitmap)) {
+			ret = true;
+			break;
+		}
+
+		start += PMD_SIZE;
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+
+	return ret;
+}
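For illustration, how the two helpers behave against the bitmap (a
hypothetical walk-through, not part of the patch): suppose the bitmap says
[32M, 128M) is still unaccepted, i.e. bits 16..63 are set.

	range_contains_unaccepted_memory(SZ_16M, SZ_32M);	/* false - fully accepted */
	range_contains_unaccepted_memory(SZ_16M, SZ_64M);	/* true - overlaps unaccepted PMDs */
	accept_memory(SZ_32M, SZ_64M);	/* accepts PMDs 16..31, clears their bits */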
Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 11/14] x86: Disable kexec if system has unaccepted memory Date: Wed, 7 Dec 2022 04:49:30 +0300 Message-Id: <20221207014933.8435-12-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> References: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org On kexec, the target kernel has to know what memory has been accepted. Information in EFI map is out of date and cannot be used. boot_params.unaccepted_memory can be used to pass the bitmap between two kernels on kexec, but the use-case is not yet implemented. Disable kexec on machines with unaccepted memory for now. Signed-off-by: Kirill A. Shutemov --- arch/x86/include/asm/kexec.h | 5 +++++ arch/x86/mm/unaccepted_memory.c | 16 ++++++++++++++++ include/linux/kexec.h | 7 +++++++ kernel/kexec.c | 4 ++++ kernel/kexec_file.c | 4 ++++ 5 files changed, 36 insertions(+) diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index a3760ca796aa..87abab578154 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -189,6 +189,11 @@ extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages); void arch_kexec_protect_crashkres(void); #define arch_kexec_protect_crashkres arch_kexec_protect_crashkres +#ifdef CONFIG_UNACCEPTED_MEMORY +int arch_kexec_load(void); +#define arch_kexec_load arch_kexec_load +#endif + void arch_kexec_unprotect_crashkres(void); #define arch_kexec_unprotect_crashkres arch_kexec_unprotect_crashkres diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c index a0a58486eb74..1745e6a65024 100644 --- a/arch/x86/mm/unaccepted_memory.c +++ b/arch/x86/mm/unaccepted_memory.c @@ -1,4 +1,5 @@ // SPDX-License-Identifier: GPL-2.0-only +#include #include #include #include @@ -98,3 +99,18 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end) return ret; } + +#ifdef CONFIG_KEXEC_CORE +int arch_kexec_load(void) +{ + if (!boot_params.unaccepted_memory) + return 0; + + /* + * TODO: Information on memory acceptance status has to be communicated + * between kernel. 
+	 */
+	pr_warn_once("Disable kexec: not yet supported on systems with unaccepted memory\n");
+	return -EOPNOTSUPP;
+}
+#endif

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 41a686996aaa..6b75051d5271 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -444,6 +444,13 @@ static inline void arch_kexec_protect_crashkres(void) { }
 static inline void arch_kexec_unprotect_crashkres(void) { }
 #endif
 
+#ifndef arch_kexec_load
+static inline int arch_kexec_load(void)
+{
+	return 0;
+}
+#endif
+
 #ifndef page_to_boot_pfn
 static inline unsigned long page_to_boot_pfn(struct page *page)
 {

diff --git a/kernel/kexec.c b/kernel/kexec.c
index cb8e6e6f983c..65dff44b487f 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -192,6 +192,10 @@ static inline int kexec_load_check(unsigned long nr_segments,
 {
 	int result;
 
+	result = arch_kexec_load();
+	if (result)
+		return result;
+
 	/* We only trust the superuser with rebooting the system. */
 	if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
 		return -EPERM;

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 45637511e0de..8f1454c3776a 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -329,6 +329,10 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 	int ret = 0, i;
 	struct kimage **dest_image, *image;
 
+	ret = arch_kexec_load();
+	if (ret)
+		return ret;
+
 	/* We only trust the superuser with rebooting the system. */
 	if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
 		return -EPERM;
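With this applied, loading a kexec image on a guest that still has
unaccepted memory is expected to fail with EOPNOTSUPP, since the arch
check runs before the capability check. A minimal userspace probe
(hypothetical test, not part of the series):

	#include <errno.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		/* No segments: just probe whether kexec load is permitted */
		long ret = syscall(SYS_kexec_load, 0UL, 0UL, NULL, 0UL);

		if (ret < 0 && errno == EOPNOTSUPP)
			printf("kexec disabled: unaccepted memory present\n");
		return 0;
	}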

From patchwork Wed Dec 7 01:49:31 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 632717
From: "Kirill A. Shutemov"
Subject: [PATCHv8 12/14] x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub
Date: Wed, 7 Dec 2022 04:49:31 +0300
Message-Id: <20221207014933.8435-13-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com>

Memory acceptance requires a hypercall and one or multiple module calls.
Make the helpers for these calls available in the boot stub, which has to
accept the memory where the kernel image and initrd are placed.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Dave Hansen
---
 arch/x86/coco/tdx/tdx.c           | 27 ------------------
 arch/x86/include/asm/shared/tdx.h | 46 +++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h        | 19 -------------
 3 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index cfd4c95b9f04..12c14affa5f2 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -14,15 +14,6 @@
 #include <asm/insn-eval.h>
 #include <asm/pgtable.h>
 
-/* TDX module Call Leaf IDs */
-#define TDX_GET_INFO			1
-#define TDX_GET_VEINFO			3
-#define TDX_GET_REPORT			4
-#define TDX_ACCEPT_PAGE			6
-
-/* TDX hypercall Leaf IDs */
-#define TDVMCALL_MAP_GPA		0x10001
-
 /* MMIO direction */
 #define EPT_READ	0
 #define EPT_WRITE	1
@@ -45,24 +36,6 @@
 
 #define TDREPORT_SUBTYPE_0	0
 
-/*
- * Wrapper for standard use of __tdx_hypercall with no output aside from
- * return code.
- */
-static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
-{
-	struct tdx_hypercall_args args = {
-		.r10 = TDX_HYPERCALL_STANDARD,
-		.r11 = fn,
-		.r12 = r12,
-		.r13 = r13,
-		.r14 = r14,
-		.r15 = r15,
-	};
-
-	return __tdx_hypercall(&args, 0);
-}
-
 /* Called from __tdx_hypercall() for unrecoverable failure */
 void __tdx_hypercall_failed(void)
 {

diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index e53f26228fbb..c5f12b90ef70 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -13,6 +13,15 @@
 #define TDX_CPUID_LEAF_ID	0x21
 #define TDX_IDENT		"IntelTDX    "
 
+/* TDX module Call Leaf IDs */
+#define TDX_GET_INFO			1
+#define TDX_GET_VEINFO			3
+#define TDX_GET_REPORT			4
+#define TDX_ACCEPT_PAGE			6
+
+/* TDX hypercall Leaf IDs */
+#define TDVMCALL_MAP_GPA		0x10001
+
 #ifndef __ASSEMBLY__
 
 /*
@@ -33,8 +42,45 @@ struct tdx_hypercall_args {
 /* Used to request services from the VMM */
 u64 __tdx_hypercall(struct tdx_hypercall_args *args, unsigned long flags);
 
+/*
+ * Wrapper for standard use of __tdx_hypercall with no output aside from
+ * return code.
+ */
+static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
+{
+	struct tdx_hypercall_args args = {
+		.r10 = TDX_HYPERCALL_STANDARD,
+		.r11 = fn,
+		.r12 = r12,
+		.r13 = r13,
+		.r14 = r14,
+		.r15 = r15,
+	};
+
+	return __tdx_hypercall(&args, 0);
+}
+
 /* Called from __tdx_hypercall() for unrecoverable failure */
 void __tdx_hypercall_failed(void);
 
+/*
+ * Used in __tdx_module_call() to gather the output registers' values of the
+ * TDCALL instruction when requesting services from the TDX module. This is a
+ * software only structure and not part of the TDX module/VMM ABI
+ */
+struct tdx_module_output {
+	u64 rcx;
+	u64 rdx;
+	u64 r8;
+	u64 r9;
+	u64 r10;
+	u64 r11;
+};
+
+/* Used to communicate with the TDX module */
+u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+		      struct tdx_module_output *out);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_SHARED_TDX_H */

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 28d889c9aa16..234197ec17e4 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -20,21 +20,6 @@
 
 #ifndef __ASSEMBLY__
 
-/*
- * Used to gather the output registers values of the TDCALL and SEAMCALL
- * instructions when requesting services from the TDX module.
- *
- * This is a software only structure and not part of the TDX module/VMM ABI.
- */
-struct tdx_module_output {
-	u64 rcx;
-	u64 rdx;
-	u64 r8;
-	u64 r9;
-	u64 r10;
-	u64 r11;
-};
-
 /*
  * Used by the #VE exception handler to gather the #VE exception
  * info from the TDX module. This is a software only structure
@@ -55,10 +40,6 @@ struct ve_info {
 
 void __init tdx_early_init(void);
 
-/* Used to communicate with the TDX module */
-u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
-		      struct tdx_module_output *out);
-
 void tdx_get_ve_info(struct ve_info *ve);
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve);
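For orientation, a sketch of how a caller can use the now-shared
__tdx_module_call() to accept a single 4k page (simplified from
try_accept_one() in the next patch; the wrapper name is made up and error
handling is elided):

	#include <asm/shared/tdx.h>

	/* Accept one 4k page at the given guest physical address */
	static bool accept_one_4k_page(u64 gpa)
	{
		/* RCX: GPA with the page size level encoded in the low bits (0 == 4k) */
		return __tdx_module_call(TDX_ACCEPT_PAGE, gpa, 0, 0, 0, NULL) == 0;
	}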
Shutemov" X-Patchwork-Id: 632714 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 185E5C352A1 for ; Wed, 7 Dec 2022 01:51:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230038AbiLGBvZ (ORCPT ); Tue, 6 Dec 2022 20:51:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60808 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230036AbiLGBum (ORCPT ); Tue, 6 Dec 2022 20:50:42 -0500 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AEC0537F4; Tue, 6 Dec 2022 17:50:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1670377809; x=1701913809; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SfxCo98L571pSepCQHUd/DhlAMN+hwpTgH7KtlWdFyw=; b=cKBynRfvF+15QWSzbGOp+v7F/JLuA0Beqk0zwYWtIBQzXT/M7zrNeWiJ S9J1AbQOnKeHSG+JVRiR5q9wE8B1FRFxwYSGzTVzkfqBdDSvNYKXC3bIN ypfXdKsRdFyJCIeqtfexIq1/nN1fjhSZ87DPcvkivvNsyZ7OEeX5lXBED qOx6qTIwYcyWgykdMllOAbE0Zu71oveEqxDuYJHQo57glFwzc3uJalaOK pRYCxh2shatQpqn250AR1dcfQAOWc8M5vgfP6L3Vpkjl056fLwNCQVy6O L5mzEqJ8YKkHGLkiQS2wu/jCYuORaVOKr/jl/wXxFmr+OQtMR3QJ14tNY w==; X-IronPort-AV: E=McAfee;i="6500,9779,10553"; a="315494588" X-IronPort-AV: E=Sophos;i="5.96,223,1665471600"; d="scan'208";a="315494588" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2022 17:50:02 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10553"; a="646427711" X-IronPort-AV: E=Sophos;i="5.96,223,1665471600"; d="scan'208";a="646427711" Received: from puneets1-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.38.123]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2022 17:49:53 -0800 Received: by box.shutemov.name (Postfix, from userid 1000) id B99B6109C91; Wed, 7 Dec 2022 04:49:39 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Dave Hansen Subject: [PATCHv8 13/14] x86/tdx: Refactor try_accept_one() Date: Wed, 7 Dec 2022 04:49:32 +0300 Message-Id: <20221207014933.8435-14-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> References: <20221207014933.8435-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Rework try_accept_one() to return accepted size instead of modifying 'start' inside the helper. It makes 'start' in-only argument and streamlines code on the caller side. Signed-off-by: Kirill A. 
Signed-off-by: Kirill A. Shutemov
Suggested-by: Borislav Petkov
Reviewed-by: Dave Hansen
---
 arch/x86/coco/tdx/tdx.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 12c14affa5f2..cf6d9a0968d8 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -674,18 +674,18 @@ static bool tdx_cache_flush_required(void)
 	return true;
 }
 
-static bool try_accept_one(phys_addr_t *start, unsigned long len,
-			   enum pg_level pg_level)
+static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
+				    enum pg_level pg_level)
 {
 	unsigned long accept_size = page_level_size(pg_level);
 	u64 tdcall_rcx;
 	u8 page_size;
 
-	if (!IS_ALIGNED(*start, accept_size))
-		return false;
+	if (!IS_ALIGNED(start, accept_size))
+		return 0;
 
 	if (len < accept_size)
-		return false;
+		return 0;
 
 	/*
 	 * Pass the page physical address to the TDX module to accept the
@@ -704,15 +704,14 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
 		page_size = 2;
 		break;
 	default:
-		return false;
+		return 0;
 	}
 
-	tdcall_rcx = *start | page_size;
+	tdcall_rcx = start | page_size;
 	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
-		return false;
+		return 0;
 
-	*start += accept_size;
-	return true;
+	return accept_size;
 }
 
 /*
@@ -749,21 +748,22 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 	 */
 	while (start < end) {
 		unsigned long len = end - start;
+		unsigned long accept_size;
 
 		/*
 		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M SEPT entries where possible and speeds up process by
-		 * cutting number of hypercalls (if successful).
+		 * 1G/2M Secure EPT entries where possible and speeds up
+		 * process by cutting number of hypercalls (if successful).
 		 */
-		if (try_accept_one(&start, len, PG_LEVEL_1G))
-			continue;
-
-		if (try_accept_one(&start, len, PG_LEVEL_2M))
-			continue;
-
-		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
+		if (!accept_size)
 			return false;
+		start += accept_size;
 	}
 
 	return true;