From patchwork Mon Apr 24 16:57:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 677009 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 225D7C7618E for ; Mon, 24 Apr 2023 16:57:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230319AbjDXQ5u (ORCPT ); Mon, 24 Apr 2023 12:57:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231207AbjDXQ5t (ORCPT ); Mon, 24 Apr 2023 12:57:49 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D9078683; Mon, 24 Apr 2023 09:57:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B358B6270A; Mon, 24 Apr 2023 16:57:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AFE33C433EF; Mon, 24 Apr 2023 16:57:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355461; bh=+4Gsdymv19vmn8hoQcaZQ5FlCwbe5D3nkuUdwGr3sXE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uNZinKz91CayYTEyw9M05IndlzScBNe9IHdWaBuTUc3XrNIwdt4ScWJPzPk0I0D1Z CbbNntlY3otWTJcGdkwxYDbWXCkJkViclstHuX7M0RvW48HxH5Il/SJv2tFsSqQ/VG d75UMGdOnvbwUs2bcWS/iP3ZhYNrIgden+SezaVvNJ6IbD1nyC7l+8kfng2eKKqKmk kKTUeSuJLXCyV77gfrWjZGS3Dhxz4kxPElOzITAyURC78JHOpOVBWnLgjl/JTOPbn+ rYxSyYmqbggVJNk7DkYHtv4ZzdjwgP0uw6c0+1MyCGGNiWsrAfXpjeQ197NmYT9qKj 5TzOUTWFD1BLw== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 1/6] x86: decompressor: Move global symbol references to C code Date: Mon, 24 Apr 2023 18:57:21 +0200 Message-Id: <20230424165726.2245548-2-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4975; i=ardb@kernel.org; h=from:subject; bh=+4Gsdymv19vmn8hoQcaZQ5FlCwbe5D3nkuUdwGr3sXE=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVty7uCizdqbotOvJN6kVXm2OHVuxKfH7FkND+V7al2/ tVR1jPFHaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAit9gZ/mcdc19hYbddda3N +wqJmjCWmXnx/L2sJaWpGYq3fDyvLwOqsLwuJRWX3cm4UC1YqnlWtK6E8Zne+fMVLNMSVt+0DeU GAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org We no longer have to be cautious when referring to global variables in the position independent decompressor code, now that we use PIE codegen and assert in the linker script that no GOT entries exist (which would require adjustment for the actual runtime load address of the decompressor binary). This means we can simply refer to global variables directly, instead of passing them into C routines from asm code. Let's do so for the code that will be called directly from the EFI stub after a subsequent patch, and avoid the need to pass these quantities directly. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_32.S | 8 -------- arch/x86/boot/compressed/head_64.S | 8 +------- arch/x86/boot/compressed/misc.c | 17 ++++++++++------- 3 files changed, 11 insertions(+), 22 deletions(-) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 987ae727cf9f0d04..3ecc1bbe971e1d43 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -179,13 +179,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) */ /* push arguments for extract_kernel: */ - pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */ pushl %ebp /* output address */ - pushl input_len@GOTOFF(%ebx) /* input_len */ - leal input_data@GOTOFF(%ebx), %eax - pushl %eax /* input_data */ - leal boot_heap@GOTOFF(%ebx), %eax - pushl %eax /* heap area */ pushl %esi /* real mode pointer */ call extract_kernel /* returns kernel entry point in %eax */ addl $24, %esp @@ -213,8 +207,6 @@ SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end) */ .bss .balign 4 -boot_heap: - .fill BOOT_HEAP_SIZE, 1, 0 boot_stack: .fill BOOT_STACK_SIZE, 1, 0 boot_stack_end: diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 03c4328a88cbd5d0..5ab014be179c1103 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -564,11 +564,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) */ pushq %rsi /* Save the real mode argument */ movq %rsi, %rdi /* real mode address */ - leaq boot_heap(%rip), %rsi /* malloc area for uncompression */ - leaq input_data(%rip), %rdx /* input_data */ - movl input_len(%rip), %ecx /* input_len */ - movq %rbp, %r8 /* output target address */ - movl output_len(%rip), %r9d /* decompressed length, end of relocs */ + movq %rbp, %rsi /* output target address */ call extract_kernel /* returns kernel entry point in %rax */ popq %rsi @@ -726,8 +722,6 @@ SYM_DATA_END_LABEL(boot_idt, SYM_L_GLOBAL, boot_idt_end) */ .bss .balign 4 -SYM_DATA_LOCAL(boot_heap, .fill BOOT_HEAP_SIZE, 1, 0) - SYM_DATA_START_LOCAL(boot_stack) .fill BOOT_STACK_SIZE, 1, 0 .balign 16 diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 014ff222bf4b35c2..3b79667412ceb388 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -330,6 +330,11 @@ static size_t parse_elf(void *output) return ehdr.e_entry - LOAD_PHYSICAL_ADDR; } +static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4); + +extern unsigned char input_data[]; +extern unsigned int input_len, output_len; + /* * The compressed kernel image (ZO), has been moved so that its position * is against the end of the buffer used to hold the uncompressed kernel @@ -347,14 +352,12 @@ static size_t parse_elf(void *output) * |-------uncompressed kernel image---------| * */ -asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, - unsigned char *input_data, - unsigned long input_len, - unsigned char *output, - unsigned long output_len) +asmlinkage __visible void *extract_kernel(void *rmode, + unsigned char *output) { const unsigned long kernel_total_size = VO__end - VO__text; unsigned long virt_addr = LOAD_PHYSICAL_ADDR; + memptr heap = (memptr)boot_heap; unsigned long needed_size; size_t entry_offset; @@ -412,7 +415,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, * entries. This ensures the full mapped area is usable RAM * and doesn't include any reserved areas. */ - needed_size = max(output_len, kernel_total_size); + needed_size = max((unsigned long)output_len, kernel_total_size); #ifdef CONFIG_X86_64 needed_size = ALIGN(needed_size, MIN_KERNEL_ALIGN); #endif @@ -443,7 +446,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, #ifdef CONFIG_X86_64 if (heap > 0x3fffffffffffUL) error("Destination address too large"); - if (virt_addr + max(output_len, kernel_total_size) > KERNEL_IMAGE_SIZE) + if (virt_addr + needed_size > KERNEL_IMAGE_SIZE) error("Destination virtual address is beyond the kernel mapping area"); #else if (heap > ((-__PAGE_OFFSET-(128<<20)-1) & 0x7fffffff)) From patchwork Mon Apr 24 16:57:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 676726 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52FE9C7618E for ; Mon, 24 Apr 2023 16:58:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232255AbjDXQ6A (ORCPT ); Mon, 24 Apr 2023 12:58:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232100AbjDXQ55 (ORCPT ); Mon, 24 Apr 2023 12:57:57 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24B169001; Mon, 24 Apr 2023 09:57:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9798862241; Mon, 24 Apr 2023 16:57:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D69EC4339E; Mon, 24 Apr 2023 16:57:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355465; bh=80bWD8npnkuVdzVziNnGYFO9rJkT0AdFSNVqiqvnJ4w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=E1RCSpsjHiZufMimPQXIKjI1Rd7JXm1nzXIjU7toafrFxGf14xUaSsVI/aU8VIhXZ ZVSb1E+hriEiFuChg/TyF8QAi2Z5Vh1OHfZBd2RgWwC2sYCJup/ArslKHadweS+x+T iNK2roTV3Likh8n8HFbiWRzGHkW31sZw4reoGshv9665aYovbzpw6nsn1B2wwb1OEw WczIpVy5Ix0ugaHFW1Ap93ZmKm6XwvmhgvkOB5ZSB+ZZz1uVbcMybR0KkcTB7I0G/s 2vf6DsT1NcsVKkM5y76RR2F++WAUyiaGTX8vch9IOu+B7NM2b5Ziba2CUmxVEcdeWJ wTJAFM5oFSCDg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 2/6] x86: decompressor: Factor out kernel decompression and relocation Date: Mon, 24 Apr 2023 18:57:22 +0200 Message-Id: <20230424165726.2245548-3-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2351; i=ardb@kernel.org; h=from:subject; bh=80bWD8npnkuVdzVziNnGYFO9rJkT0AdFSNVqiqvnJ4w=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVty4eg9+zBkWuPvC9QTSiJrTg46/iut40+2dxbTRK4l s8P/1zeUcrCIMbBICumyCIw+++7nacnStU6z5KFmcPKBDKEgYtTACbCvYThn1YNx+8Vp3bP1g2N UH02oW6PrcKcuR4sp+6ss5YxibFhE2b4HyU253qBj/5N9gOTTvbJKt/w/Chy8jOrXUysrFH39XU tDAA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Factor out the decompressor sequence that invokes the decompressor, parses the ELF and applies the relocations so that we will be able to call it directly from the EFI stub. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/misc.c | 27 ++++++++++++++++---- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 3b79667412ceb388..0fa5df49ce14f196 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -330,11 +330,32 @@ static size_t parse_elf(void *output) return ehdr.e_entry - LOAD_PHYSICAL_ADDR; } +const unsigned long kernel_total_size = VO__end - VO__text; + static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4); extern unsigned char input_data[]; extern unsigned int input_len, output_len; +unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr, + void (*error)(char *x)) +{ + unsigned long entry; + + if (!free_mem_ptr) { + free_mem_ptr = (unsigned long)boot_heap; + free_mem_end_ptr = (unsigned long)boot_heap + sizeof(boot_heap); + } + + __decompress(input_data, input_len, NULL, NULL, outbuf, output_len, + NULL, error); + + entry = parse_elf(outbuf); + handle_relocations(outbuf, output_len, virt_addr); + + return entry; +} + /* * The compressed kernel image (ZO), has been moved so that its position * is against the end of the buffer used to hold the uncompressed kernel @@ -355,7 +376,6 @@ extern unsigned int input_len, output_len; asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output) { - const unsigned long kernel_total_size = VO__end - VO__text; unsigned long virt_addr = LOAD_PHYSICAL_ADDR; memptr heap = (memptr)boot_heap; unsigned long needed_size; @@ -458,10 +478,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, #endif debug_putstr("\nDecompressing Linux... "); - __decompress(input_data, input_len, NULL, NULL, output, output_len, - NULL, error); - entry_offset = parse_elf(output); - handle_relocations(output, output_len, virt_addr); + entry_offset = decompress_kernel(output, virt_addr, error); debug_putstr("done.\nBooting the kernel (entry_offset: 0x"); debug_puthex(entry_offset); From patchwork Mon Apr 24 16:57:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 677008 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 453A9C77B73 for ; Mon, 24 Apr 2023 16:58:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232244AbjDXQ6E (ORCPT ); Mon, 24 Apr 2023 12:58:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232245AbjDXQ57 (ORCPT ); Mon, 24 Apr 2023 12:57:59 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A741AA26B; Mon, 24 Apr 2023 09:57:52 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6FF876272F; Mon, 24 Apr 2023 16:57:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6BD3AC4339C; Mon, 24 Apr 2023 16:57:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355468; bh=r/rF9eDTXhHmlrkb2lsJlgyLhYQxh0Hseo352Wqu2+Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=my7xhwSkW6ZiIiIYMlrS49valDwR9XPU+7UEgHKhK/8yi+F/VZFrxbvy0m21Rc7sm UzcBnn2OUdXmfJ4H99y3qn4ZID7nbaIaGnjKLJ9PYECIMhsY2BSs9KbddcvGh5JxMC GHAyrpXVnRYUoSY0seRh/wCOra6PWFocfbMPEAw/SnQUWgCCWnUJRwyiZ5AinvrnIG xmzxf2LP13OI8ygnrZxWcnlnVz+dM9tzTPtHU7GnLy9+mDa5T4LYlOeFTG6Dh/Q+tZ 4Xq00RobvlP/RvKmVbo0rzqajnwNDc8cZxE4VUrVylliUVqdlxfrlroT37OZ2BRHuz Gp6YqaR2V4y6w== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 3/6] x86: efistub: Obtain ACPI RSDP address while running in the stub Date: Mon, 24 Apr 2023 18:57:23 +0200 Message-Id: <20230424165726.2245548-4-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1119; i=ardb@kernel.org; h=from:subject; bh=r/rF9eDTXhHmlrkb2lsJlgyLhYQxh0Hseo352Wqu2+Y=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVty8dvM+dItfbu4b80N7Sy/OvTQLfZIn/5XKZK+fhsE I4Pj4/pKGVhEONgkBVTZBGY/ffdztMTpWqdZ8nCzGFlAhnCwMUpABOZ3sfwv0DDddqvaRv+T7Rn XfjU2OCBZjDH2ZjnU9k+VFuctRSPVGP4p2tvZvfwamN53fRvFcl9th9lt1meX1z6okS+zzSTn/k LEwA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org One of the actions performed by the decompressor is populating the RSDP address field in the boot_params struct, and when doing EFI boot, EFI configuration tables are the preferred source for this information. In preparation for removing the decompressor code from the EFI stub boot path, set this field from the EFI stub code. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/x86-stub.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index a0bfd31358ba97b1..e136c94037dda8d3 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -787,6 +787,11 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_dxe_table = NULL; } + if (!boot_params->acpi_rsdp_addr) + boot_params->acpi_rsdp_addr = (unsigned long) + (get_efi_config_table(ACPI_20_TABLE_GUID) ?: + get_efi_config_table(ACPI_TABLE_GUID)); + /* * If the kernel isn't already loaded at a suitable address, * relocate it. From patchwork Mon Apr 24 16:57:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 676725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E4E6C77B61 for ; Mon, 24 Apr 2023 16:58:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231207AbjDXQ6J (ORCPT ); Mon, 24 Apr 2023 12:58:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231851AbjDXQ6A (ORCPT ); Mon, 24 Apr 2023 12:58:00 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89DC2A257; Mon, 24 Apr 2023 09:57:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 56A0362209; Mon, 24 Apr 2023 16:57:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4A960C4339E; Mon, 24 Apr 2023 16:57:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355472; bh=x0BOo1C921LZfjOOg4Hj+pr1OqgyQhQHCvsWmPOfruY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=sQ8P9uCtM7OuU48FixAiNcryK170nO5pGY/V56KSvsnYaCSYtRTxf/eOLBvWu6k0b LEiwfUxes8TxLN5115QmjVna2eitx4qSvPInknO3Xl9lxahp9v9n534t1R9TxqUSkq 1uzHoM7tg9s0BFUAylCAowMfquG9wnnLq8zn4cg3pNrnmkIN+YIdiKDTYVLQNm7FPD FXilJMaUjJGJeRqXlDvsPoDBk/7wy9H7t2wDd4lQbGoOc58qsAb2acAFu5TeFvoQq0 sM6OuYVB3V07rJ9EqGwnr4ady6nBJYxxqsBg90K7qtE9vY6RwvZauT935QqXg/w68o C1g+zoYML2OPQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 4/6] x86: efistub: Perform 4/5 level paging switch from the stub Date: Mon, 24 Apr 2023 18:57:24 +0200 Message-Id: <20230424165726.2245548-5-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6959; i=ardb@kernel.org; h=from:subject; bh=x0BOo1C921LZfjOOg4Hj+pr1OqgyQhQHCvsWmPOfruY=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVty6e2k5f29c2Y9a/Q0OYlx3XH10tkXjVz+rmV7/jHu 1Fi4WOvjlIWBjEOBlkxRRaB2X/f7Tw9UarWeZYszBxWJpAhDFycAjCRHQsZGXaopp26uFQvLGmj 00G/W9ItLBxeB6/bdbMtuqvjm/19xkmGfwrTFj7YcCT0wMJwrap7XEef1yW67dY9IRX5yGX2ig2 dCewA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org In preparation for updating the EFI stub boot flow to avoid the bare metal decompressor code altogether, implement the support code for switching between 4 and 5 levels of paging before jumping to the kernel proper. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/efi-stub-helper.c | 4 + drivers/firmware/efi/libstub/x86-stub.c | 145 ++++++++++++++++++++ 2 files changed, 149 insertions(+) diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c index 1e0203d74691ffcc..fc5f3b4c45e91401 100644 --- a/drivers/firmware/efi/libstub/efi-stub-helper.c +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c @@ -16,6 +16,8 @@ #include "efistub.h" +extern bool efi_no5lvl; + bool efi_nochunk; bool efi_nokaslr = !IS_ENABLED(CONFIG_RANDOMIZE_BASE); bool efi_novamap; @@ -73,6 +75,8 @@ efi_status_t efi_parse_options(char const *cmdline) efi_loglevel = CONSOLE_LOGLEVEL_QUIET; } else if (!strcmp(param, "noinitrd")) { efi_noinitrd = true; + } else if (IS_ENABLED(CONFIG_X86_64) && !strcmp(param, "no5lvl")) { + efi_no5lvl = true; } else if (!strcmp(param, "efi") && val) { efi_nochunk = parse_option_str(val, "nochunk"); efi_novamap |= parse_option_str(val, "novamap"); diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index e136c94037dda8d3..7b8717cbb96a1246 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -760,6 +760,139 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle) return EFI_SUCCESS; } +#ifdef CONFIG_X86_64 +bool efi_no5lvl; + +static const struct desc_struct gdt[] = { + [GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff), + [GDT_ENTRY_KERNEL_CS] = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff), + [GDT_ENTRY_KERNEL_DS] = GDT_ENTRY_INIT(0xc093, 0, 0xfffff), +}; + +static void (*la57_toggle)(void *cr3, void *gdt); + +static void __naked tmpl_toggle(void *cr3, void *gdt) +{ + /* + * This is template code that will be copied into a 32-bit addressable + * buffer, allowing us to drop to 32-bit mode with paging disabled, + * which is required to be able to toggle the CR4.LA57 bit. + * + * The first MOVB instruction is only there to capture the size of the + * sequence, and implicitly, the offset to the LJMP's immediate, which + * will be populated with the correct absolute address after copying. + */ + asm("0: movb $(4f - .), %%al \n\t" + " lgdt (%%rsi) \n\t" + " movw %[ds], %%ax \n\t" + " movw %%ax, %%ds \n\t" + " movw %%ax, %%ss \n\t" + " leaq 2f(%%rip), %%rax \n\t" + " pushq %[cs32] \n\t" + " pushq %%rax \n\t" + " lretq \n\t" + "1: retq \n\t" + " .code32 \n\t" + "2: movl %%cr0, %%eax \n\t" + " btrl %[pg], %%eax \n\t" + " movl %%eax, %%cr0 \n\t" + " jmp 3f \n\t" + "3: movl %%cr4, %%ecx \n\t" + " btcl %[la57], %%ecx \n\t" + " movl %%ecx, %%cr4 \n\t" + " movl %%edi, %%cr3 \n\t" + " btsl %[pg], %%eax \n\t" + " movl %%eax, %%cr0 \n\t" + " ljmpl %[cs], $(1b - 0b) \n\t" + "4: .code64" + : + : [cs32] "i"(__KERNEL32_CS), + [cs] "i"(__KERNEL_CS), + [ds] "i"(__KERNEL_DS), + [pg] "i"(X86_CR0_PG_BIT), + [la57] "i"(X86_CR4_LA57_BIT)); +} + +/* + * Enabling (or disabling) 5 level paging is tricky, because it can only be + * done from 32-bit mode with paging disabled. This means not only that the + * code itself must be running from 32-bit addressable physical memory, but + * also that the root page table must be 32-bit addressable, as we cannot + * program a 64-bit value into CR3 when running in 32-bit mode. + */ +static efi_status_t efi_setup_5level_paging(void) +{ + const u8 tmpl_size = ((u8 *)tmpl_toggle)[1]; + efi_status_t status; + u8 *la57_code; + + if (!efi_is_64bit()) + return EFI_SUCCESS; + + /* check for 5 level paging support */ + if (native_cpuid_eax(0) < 7 || + !(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) + return EFI_SUCCESS; + + /* allocate some 32-bit addressable memory for code and a page table */ + status = efi_allocate_pages(2 * PAGE_SIZE, (unsigned long *)&la57_code, + U32_MAX); + if (status != EFI_SUCCESS) + return status; + + la57_toggle = memcpy(la57_code, tmpl_toggle, tmpl_size); + memset(la57_code + tmpl_size, 0x90, PAGE_SIZE - tmpl_size); + + /* + * To avoid having to allocate a 32-bit addressable stack, we use a + * ljmp to switch back to long mode. However, this takes an absolute + * address, so we have to poke it in at runtime. The dummy MOVB + * instruction at the beginning can be used to locate the immediate. + */ + *(u32 *)&la57_code[tmpl_size - 6] += (unsigned long)la57_code; + + adjust_memory_range_protection((unsigned long)la57_code, PAGE_SIZE); + + return EFI_SUCCESS; +} + +static void efi_5level_switch(void) +{ + bool want_la57 = IS_ENABLED(CONFIG_X86_5LEVEL) && !efi_no5lvl; + bool have_la57 = native_read_cr4() & X86_CR4_LA57; + bool need_toggle = want_la57 ^ have_la57; + u64 *pgt = (void *)la57_toggle + PAGE_SIZE; + u64 *cr3 = (u64 *)__native_read_cr3(); + struct desc_ptr desc; + u64 *new_cr3; + + if (!la57_toggle || !need_toggle) + return; + + if (!have_la57) { + /* + * We are going to enable 5 level paging, so we need to + * allocate a root level page from the 32-bit addressable + * physical region, and plug the existing hierarchy into it. + */ + new_cr3 = memset(pgt, 0, PAGE_SIZE); + new_cr3[0] = (u64)cr3 | _PAGE_TABLE_NOENC; + } else { + // take the new root table pointer from the current entry #0 + new_cr3 = (u64 *)(cr3[0] & PAGE_MASK); + + // copy the new root level table if it is not 32-bit addressable + if ((u64)new_cr3 > U32_MAX) + new_cr3 = memcpy(pgt, new_cr3, PAGE_SIZE); + } + + desc.size = sizeof(gdt) - 1; + desc.address = (u64)gdt; + + la57_toggle(new_cr3, &desc); +} +#endif + /* * On success, we return the address of startup_32, which has potentially been * relocated by efi_relocate_kernel. @@ -792,6 +925,14 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, (get_efi_config_table(ACPI_20_TABLE_GUID) ?: get_efi_config_table(ACPI_TABLE_GUID)); +#ifdef CONFIG_X86_64 + status = efi_setup_5level_paging(); + if (status != EFI_SUCCESS) { + efi_err("efi_setup_5level_paging() failed!\n"); + goto fail; + } +#endif + /* * If the kernel isn't already loaded at a suitable address, * relocate it. @@ -910,6 +1051,10 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, goto fail; } +#ifdef CONFIG_X86_64 + efi_5level_switch(); +#endif + return bzimage_addr; fail: efi_err("efi_main() failed!\n"); From patchwork Mon Apr 24 16:57:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 677007 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36BA5C77B61 for ; Mon, 24 Apr 2023 16:58:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232272AbjDXQ6R (ORCPT ); Mon, 24 Apr 2023 12:58:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232222AbjDXQ6D (ORCPT ); Mon, 24 Apr 2023 12:58:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B644A7D89; Mon, 24 Apr 2023 09:57:57 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2D55862209; Mon, 24 Apr 2023 16:57:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 296C8C4339C; Mon, 24 Apr 2023 16:57:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355476; bh=+UAx1b4wALU5+iWX8voL/AzXHT4rT38dEfcqRFfOVjw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=p06tU5UCRqKvj2ttM49GPkKyGllkbg+S7TjvhD4H3NoTsw6Pc5usF0D1J7UG1YvPo tXeNFlCeU4Gpu2bP6nSaUiJXJGDbOqOkFpv+fJLKFC+r4msnfvdy2hbJfL2Ak3bkEt Sl2XIt7Q/KteAWoQjxunBZVMiHpK8WTW2sIlDSR3DqQA54CNL3zBrkuvkzlJYGvxW2 v1svK4pCCw7/PyzsaPDFc7wEdIjyBWw7Q2HBDbZUsSgoVgiPn0rWIkE/UkF8RS8p+D Q5CpVHDh37Mx7gJ4HxftpNu+FOKTtKPCWuyqqFUNdjlGvEuLkT0/cRAHXYy0Cz4cFX c+u3Ur2VHjcpw== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 5/6] x86: efistub: Prefer EFI memory attributes protocol over DXE services Date: Mon, 24 Apr 2023 18:57:25 +0200 Message-Id: <20230424165726.2245548-6-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2947; i=ardb@kernel.org; h=from:subject; bh=+UAx1b4wALU5+iWX8voL/AzXHT4rT38dEfcqRFfOVjw=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVtyxddNYUPXzIfuR9fYxqd/Kk4QSEinuNH7O6y+sUvf +S1T47uKGVhEONgkBVTZBGY/ffdztMTpWqdZ8nCzGFlAhnCwMUpABM5OofhD1cCx4WPZ002xDtb FZStWOyw/Di74UfTCQ8ZZnTHf72uN5Phr+yU/vTIpuSDHBxn4x6qMM4tWBUrqO0x7WpwvzW3c0g NHwA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Currently, we rely on DXE services in some cases to clear non-execute restrictions from page allocations that need to be executable. This is dodgy, because DXE services are not specified by UEFI but by PI, and they are not intended for consumption by OS loaders. However, no alternative existed at the time. Now, there is a new UEFI protocol that should be used instead, so if it exists, prefer it over the DXE services calls. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/x86-stub.c | 28 ++++++++++++++------ 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 7b8717cbb96a1246..ea4024a6a04e507f 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -25,6 +25,7 @@ const efi_system_table_t *efi_system_table; const efi_dxe_services_table_t *efi_dxe_table; u32 image_offset __section(".data"); static efi_loaded_image_t *image = NULL; +static efi_memory_attribute_protocol_t *memattr; static efi_status_t preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom) @@ -221,12 +222,18 @@ adjust_memory_range_protection(unsigned long start, unsigned long size) unsigned long rounded_start, rounded_end; unsigned long unprotect_start, unprotect_size; - if (efi_dxe_table == NULL) - return; - rounded_start = rounddown(start, EFI_PAGE_SIZE); rounded_end = roundup(start + size, EFI_PAGE_SIZE); + if (memattr != NULL) { + efi_call_proto(memattr, clear_memory_attributes, rounded_start, + rounded_end - rounded_start, EFI_MEMORY_XP); + return; + } + + if (efi_dxe_table == NULL) + return; + /* * Don't modify memory region attributes, they are * already suitable, to lower the possibility to @@ -913,13 +920,18 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) efi_exit(handle, EFI_INVALID_PARAMETER); - efi_dxe_table = get_efi_config_table(EFI_DXE_SERVICES_TABLE_GUID); - if (efi_dxe_table && - efi_dxe_table->hdr.signature != EFI_DXE_SERVICES_TABLE_SIGNATURE) { - efi_warn("Ignoring DXE services table: invalid signature\n"); - efi_dxe_table = NULL; + if (IS_ENABLED(CONFIG_EFI_DXE_MEM_ATTRIBUTES)) { + efi_dxe_table = get_efi_config_table(EFI_DXE_SERVICES_TABLE_GUID); + if (efi_dxe_table && + efi_dxe_table->hdr.signature != EFI_DXE_SERVICES_TABLE_SIGNATURE) { + efi_warn("Ignoring DXE services table: invalid signature\n"); + efi_dxe_table = NULL; + } } + /* grab the memory attributes protocol if it exists */ + efi_bs_call(locate_protocol, &guid, NULL, (void **)&memattr); + if (!boot_params->acpi_rsdp_addr) boot_params->acpi_rsdp_addr = (unsigned long) (get_efi_config_table(ACPI_20_TABLE_GUID) ?: From patchwork Mon Apr 24 16:57:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 676724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A4C7C77B61 for ; Mon, 24 Apr 2023 16:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232271AbjDXQ6a (ORCPT ); Mon, 24 Apr 2023 12:58:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232097AbjDXQ6R (ORCPT ); Mon, 24 Apr 2023 12:58:17 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E93249001; Mon, 24 Apr 2023 09:58:01 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 33726615A0; Mon, 24 Apr 2023 16:58:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 07F4EC433D2; Mon, 24 Apr 2023 16:57:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1682355480; bh=6MdVknqRTZnvXIoF+NO/nO50YczB4j7aGopG7v5Dbbk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bDR1h8UDSjErj4CCERVbn5VxtQZlrJSZaKADAMrqV1aCcLCIvdtvw56XFGptUEgzl ul496K7G2Nit80C039vw3QcSvfr3z/cHi2sAQhlFYC0xi4z6kCzEbQ54++AR1m0waQ aCoHXwDVAApXix2/8CChK9WX2R3GA4WRDApRewT8GhFjf5XAi4m4A9vnNHdhK7BYly ZM4G0QryTpG0+WWO3545IGJLp1+nnNdXXErOmQ8JXSRZ0DukCGMRq77P3cBG8Su0vf YT6hoVcg78Vc9nNfH/HwzeMqqbvAr8CzMq8kB0D1EUb7mwnYBFxgHECln5WmGzoBrn OF4MLrmWTTSCw== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH 6/6] x86: efistub: Avoid legacy decompressor when doing EFI boot Date: Mon, 24 Apr 2023 18:57:26 +0200 Message-Id: <20230424165726.2245548-7-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230424165726.2245548-1-ardb@kernel.org> References: <20230424165726.2245548-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=17551; i=ardb@kernel.org; h=from:subject; bh=6MdVknqRTZnvXIoF+NO/nO50YczB4j7aGopG7v5Dbbk=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIcVty9fGgwd5z/LPXaN44djq8wrK26YfaXz2+zFX6ReFH fFlt6YbdZSyMIhxMMiKKbIIzP77bufpiVK1zrNkYeawMoEMYeDiFICJ7A5h+B8w1zDzk4PiytPv Zkx637R0RtBZ9htbG9gXnpyuNFlwzXwzRob7Z0tbz1VZv769SWz/zwvKWxN0XHymvk3l2bep2LZ WIZ8LAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The bare metal decompressor code was never really intended to run in a hosted environment such as the EFI boot services, and does a few things that are problematic in the context of EFI boot now that the logo requirements are getting tighter. In particular, the decompressor moves itself around in memory, and assumes that arbitrary chunks of low memory can be temporarily emptied, populated with trampoline code and restored again afterwards, and this is difficult to support in a context where memory is not permitted to be mapped writable and executable at the same time or, at the very least, is mapped non-executable by default, and needs special treatment for this restriction to be lifted. Since EFI already maps all of memory 1:1, we don't need to create new page tables or handle page faults when decompressing the kernel. That means there is also no need to set up special exception handlers for SEV. Generally, there is little need to do anything that the decompressor does beyond - perform the 4/5 level paging switch, if needed, - decompress the kernel - relocate the kernel So let's do all of this from the EFI stub code, and avoid the bare metal decompressor altogether. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/efi_mixed.S | 55 ------ arch/x86/boot/compressed/head_32.S | 16 -- arch/x86/boot/compressed/head_64.S | 31 ---- arch/x86/include/asm/efi.h | 2 + drivers/firmware/efi/libstub/x86-stub.c | 188 ++++++++------------ 5 files changed, 75 insertions(+), 217 deletions(-) diff --git a/arch/x86/boot/compressed/efi_mixed.S b/arch/x86/boot/compressed/efi_mixed.S index 4ca70bf93dc0bdcd..075dd6410f9a4d60 100644 --- a/arch/x86/boot/compressed/efi_mixed.S +++ b/arch/x86/boot/compressed/efi_mixed.S @@ -245,10 +245,6 @@ SYM_FUNC_START(efi32_entry) jmp startup_32 SYM_FUNC_END(efi32_entry) -#define ST32_boottime 60 // offsetof(efi_system_table_32_t, boottime) -#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol) -#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base) - /* * efi_status_t efi32_pe_entry(efi_handle_t image_handle, * efi_system_table_32_t *sys_table) @@ -256,8 +252,6 @@ SYM_FUNC_END(efi32_entry) SYM_FUNC_START(efi32_pe_entry) pushl %ebp movl %esp, %ebp - pushl %eax // dummy push to allocate loaded_image - pushl %ebx // save callee-save registers pushl %edi @@ -266,48 +260,8 @@ SYM_FUNC_START(efi32_pe_entry) movl $0x80000003, %eax // EFI_UNSUPPORTED jnz 2f - call 1f -1: pop %ebx - - /* Get the loaded image protocol pointer from the image handle */ - leal -4(%ebp), %eax - pushl %eax // &loaded_image - leal (loaded_image_proto - 1b)(%ebx), %eax - pushl %eax // pass the GUID address - pushl 8(%ebp) // pass the image handle - - /* - * Note the alignment of the stack frame. - * sys_table - * handle <-- 16-byte aligned on entry by ABI - * return address - * frame pointer - * loaded_image <-- local variable - * saved %ebx <-- 16-byte aligned here - * saved %edi - * &loaded_image - * &loaded_image_proto - * handle <-- 16-byte aligned for call to handle_protocol - */ - - movl 12(%ebp), %eax // sys_table - movl ST32_boottime(%eax), %eax // sys_table->boottime - call *BS32_handle_protocol(%eax) // sys_table->boottime->handle_protocol - addl $12, %esp // restore argument space - testl %eax, %eax - jnz 2f - movl 8(%ebp), %ecx // image_handle movl 12(%ebp), %edx // sys_table - movl -4(%ebp), %esi // loaded_image - movl LI32_image_base(%esi), %esi // loaded_image->image_base - leal (startup_32 - 1b)(%ebx), %ebp // runtime address of startup_32 - /* - * We need to set the image_offset variable here since startup_32() will - * use it before we get to the 64-bit efi_pe_entry() in C code. - */ - subl %esi, %ebp // calculate image_offset - movl %ebp, (image_offset - 1b)(%ebx) // save image_offset xorl %esi, %esi jmp efi32_entry // pass %ecx, %edx, %esi // no other registers remain live @@ -318,15 +272,6 @@ SYM_FUNC_START(efi32_pe_entry) RET SYM_FUNC_END(efi32_pe_entry) - .section ".rodata" - /* EFI loaded image protocol GUID */ - .balign 4 -SYM_DATA_START_LOCAL(loaded_image_proto) - .long 0x5b1b31a1 - .word 0x9562, 0x11d2 - .byte 0x8e, 0x3f, 0x00, 0xa0, 0xc9, 0x69, 0x72, 0x3b -SYM_DATA_END(loaded_image_proto) - .data .balign 8 SYM_DATA_START_LOCAL(efi32_boot_gdt) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 3ecc1bbe971e1d43..a578fe178f1cfee7 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -84,19 +84,6 @@ SYM_FUNC_START(startup_32) #ifdef CONFIG_RELOCATABLE leal startup_32@GOTOFF(%edx), %ebx - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32() will be at an - * offset to the start of the space allocated for the image. efi_pe_entry() will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - subl image_offset@GOTOFF(%edx), %ebx -#endif - movl BP_kernel_alignment(%esi), %eax decl %eax addl %eax, %ebx @@ -153,10 +140,7 @@ SYM_FUNC_END(startup_32) #ifdef CONFIG_EFI_STUB SYM_FUNC_START(efi32_stub_entry) add $0x4, %esp - movl 8(%esp), %esi /* save boot_params pointer */ call efi_main - /* efi_main returns the possibly relocated address of startup_32 */ - jmp *%eax SYM_FUNC_END(efi32_stub_entry) SYM_FUNC_ALIAS(efi_stub_entry, efi32_stub_entry) #endif diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 5ab014be179c1103..eb8ead7bce9fc577 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -146,19 +146,6 @@ SYM_FUNC_START(startup_32) #ifdef CONFIG_RELOCATABLE movl %ebp, %ebx - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32 will be at an - * offset to the start of the space allocated for the image. efi_pe_entry will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - subl rva(image_offset)(%ebp), %ebx -#endif - movl BP_kernel_alignment(%esi), %eax decl %eax addl %eax, %ebx @@ -346,20 +333,6 @@ SYM_CODE_START(startup_64) /* Start with the delta to where the kernel will run at. */ #ifdef CONFIG_RELOCATABLE leaq startup_32(%rip) /* - $startup_32 */, %rbp - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32 will be at an - * offset to the start of the space allocated for the image. efi_pe_entry will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - movl image_offset(%rip), %eax - subq %rax, %rbp -#endif - movl BP_kernel_alignment(%rsi), %eax decl %eax addq %rax, %rbp @@ -529,11 +502,7 @@ SYM_CODE_END(startup_64) #endif SYM_FUNC_START(efi64_stub_entry) and $~0xf, %rsp /* realign the stack */ - movq %rdx, %rbx /* save boot_params pointer */ call efi_main - movq %rbx,%rsi - leaq rva(startup_64)(%rax), %rax - jmp *%rax SYM_FUNC_END(efi64_stub_entry) SYM_FUNC_ALIAS(efi_stub_entry, efi64_stub_entry) #endif diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index 419280d263d2e3f2..36b848e7d088a59b 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -216,6 +216,8 @@ efi_status_t efi_set_virtual_address_map(unsigned long memory_map_size, #ifdef CONFIG_EFI_MIXED +#define EFI_ALLOC_LIMIT (efi_is_64bit() ? ULONG_MAX : U32_MAX) + #define ARCH_HAS_EFISTUB_WRAPPERS static inline bool efi_is_64bit(void) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index ea4024a6a04e507f..dc45bcb74d1552c2 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -18,12 +18,8 @@ #include "efistub.h" -/* Maximum physical address for 64-bit kernel with 4-level paging */ -#define MAXMEM_X86_64_4LEVEL (1ull << 46) - const efi_system_table_t *efi_system_table; const efi_dxe_services_table_t *efi_dxe_table; -u32 image_offset __section(".data"); static efi_loaded_image_t *image = NULL; static efi_memory_attribute_protocol_t *memattr; @@ -274,54 +270,9 @@ adjust_memory_range_protection(unsigned long start, unsigned long size) } } -/* - * Trampoline takes 2 pages and can be loaded in first megabyte of memory - * with its end placed between 128k and 640k where BIOS might start. - * (see arch/x86/boot/compressed/pgtable_64.c) - * - * We cannot find exact trampoline placement since memory map - * can be modified by UEFI, and it can alter the computed address. - */ - -#define TRAMPOLINE_PLACEMENT_BASE ((128 - 8)*1024) -#define TRAMPOLINE_PLACEMENT_SIZE (640*1024 - (128 - 8)*1024) - -void startup_32(struct boot_params *boot_params); - -static void -setup_memory_protection(unsigned long image_base, unsigned long image_size) -{ - /* - * Allow execution of possible trampoline used - * for switching between 4- and 5-level page tables - * and relocated kernel image. - */ - - adjust_memory_range_protection(TRAMPOLINE_PLACEMENT_BASE, - TRAMPOLINE_PLACEMENT_SIZE); - -#ifdef CONFIG_64BIT - if (image_base != (unsigned long)startup_32) - adjust_memory_range_protection(image_base, image_size); -#else - /* - * Clear protection flags on a whole range of possible - * addresses used for KASLR. We don't need to do that - * on x86_64, since KASLR/extraction is performed after - * dedicated identity page tables are built and we only - * need to remove possible protection on relocated image - * itself disregarding further relocations. - */ - adjust_memory_range_protection(LOAD_PHYSICAL_ADDR, - KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR); -#endif -} - static const efi_char16_t apple[] = L"Apple"; -static void setup_quirks(struct boot_params *boot_params, - unsigned long image_base, - unsigned long image_size) +static void setup_quirks(struct boot_params *boot_params) { efi_char16_t *fw_vendor = (efi_char16_t *)(unsigned long) efi_table_attr(efi_system_table, fw_vendor); @@ -330,9 +281,6 @@ static void setup_quirks(struct boot_params *boot_params, if (IS_ENABLED(CONFIG_APPLE_PROPERTIES)) retrieve_apple_device_properties(boot_params); } - - if (IS_ENABLED(CONFIG_EFI_DXE_MEM_ATTRIBUTES)) - setup_memory_protection(image_base, image_size); } /* @@ -485,7 +433,6 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, } image_base = efi_table_attr(image, image_base); - image_offset = (void *)startup_32 - image_base; status = efi_allocate_pages(sizeof(struct boot_params), (unsigned long *)&boot_params, ULONG_MAX); @@ -900,19 +847,78 @@ static void efi_5level_switch(void) } #endif +static void efi_get_seed(void *seed, int size) +{ + efi_get_random_bytes(size, seed); + + // TODO mix in RDRAND/TSC +} + +static void error(char *str) +{ + efi_warn("Decompression failed: %s\n", str); +} + +int decompress_kernel(unsigned char *outbuf, unsigned long virt_addr, + void (*error)(char *x)); + +static efi_status_t efi_decompress_kernel(unsigned long *kernel_entry) +{ + unsigned long virt_addr = LOAD_PHYSICAL_ADDR; + extern const unsigned long kernel_total_size; + extern unsigned int input_len, output_len; + extern void *input_data; + unsigned long addr, alloc_size, entry; + unsigned long seed[2] = {}; + efi_status_t status; + + /* determine the required size of the allocation */ + alloc_size = ALIGN(max((unsigned long)output_len, kernel_total_size), + MIN_KERNEL_ALIGN); + + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && !efi_nokaslr) { + u64 range = KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR - kernel_total_size; + + efi_get_seed(seed, sizeof(seed)); + + virt_addr += (range * (u32)seed[1]) >> 32; + virt_addr &= ~(CONFIG_PHYSICAL_ALIGN - 1); + } + + status = efi_random_alloc(alloc_size, CONFIG_PHYSICAL_ALIGN, &addr, + seed[0], EFI_LOADER_CODE); + if (status != EFI_SUCCESS) + return status; + + entry = decompress_kernel((void *)addr, virt_addr, error); + *kernel_entry = addr + entry; + + adjust_memory_range_protection(addr, kernel_total_size); + + return EFI_SUCCESS; +} + +static void __noreturn enter_kernel(unsigned long kernel_addr, + struct boot_params *boot_params) +{ + /* enter decompressed kernel with boot_params pointer in RSI/ESI */ + asm("jmp *%0"::"r"(kernel_addr), "S"(boot_params)); + + unreachable(); +} + /* - * On success, we return the address of startup_32, which has potentially been - * relocated by efi_relocate_kernel. - * On failure, we exit to the firmware via efi_exit instead of returning. + * On success, we jump to the relocated kernel directly and never return; + * on failure, we exit to the firmware via efi_exit instead of returning. */ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_system_table_t *sys_table_arg, struct boot_params *boot_params) { - unsigned long bzimage_addr = (unsigned long)startup_32; - unsigned long buffer_start, buffer_end; + efi_guid_t guid = EFI_MEMORY_ATTRIBUTE_PROTOCOL_GUID; struct setup_header *hdr = &boot_params->hdr; const struct linux_efi_initrd *initrd = NULL; + unsigned long kernel_entry; efi_status_t status; efi_system_table = sys_table_arg; @@ -945,60 +951,6 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, } #endif - /* - * If the kernel isn't already loaded at a suitable address, - * relocate it. - * - * It must be loaded above LOAD_PHYSICAL_ADDR. - * - * The maximum address for 64-bit is 1 << 46 for 4-level paging. This - * is defined as the macro MAXMEM, but unfortunately that is not a - * compile-time constant if 5-level paging is configured, so we instead - * define our own macro for use here. - * - * For 32-bit, the maximum address is complicated to figure out, for - * now use KERNEL_IMAGE_SIZE, which will be 512MiB, the same as what - * KASLR uses. - * - * Also relocate it if image_offset is zero, i.e. the kernel wasn't - * loaded by LoadImage, but rather by a bootloader that called the - * handover entry. The reason we must always relocate in this case is - * to handle the case of systemd-boot booting a unified kernel image, - * which is a PE executable that contains the bzImage and an initrd as - * COFF sections. The initrd section is placed after the bzImage - * without ensuring that there are at least init_size bytes available - * for the bzImage, and thus the compressed kernel's startup code may - * overwrite the initrd unless it is moved out of the way. - */ - - buffer_start = ALIGN(bzimage_addr - image_offset, - hdr->kernel_alignment); - buffer_end = buffer_start + hdr->init_size; - - if ((buffer_start < LOAD_PHYSICAL_ADDR) || - (IS_ENABLED(CONFIG_X86_32) && buffer_end > KERNEL_IMAGE_SIZE) || - (IS_ENABLED(CONFIG_X86_64) && buffer_end > MAXMEM_X86_64_4LEVEL) || - (image_offset == 0)) { - extern char _bss[]; - - status = efi_relocate_kernel(&bzimage_addr, - (unsigned long)_bss - bzimage_addr, - hdr->init_size, - hdr->pref_address, - hdr->kernel_alignment, - LOAD_PHYSICAL_ADDR); - if (status != EFI_SUCCESS) { - efi_err("efi_relocate_kernel() failed!\n"); - goto fail; - } - /* - * Now that we've copied the kernel elsewhere, we no longer - * have a set up block before startup_32(), so reset image_offset - * to zero in case it was set earlier. - */ - image_offset = 0; - } - #ifdef CONFIG_CMDLINE_BOOL status = efi_parse_options(CONFIG_CMDLINE); if (status != EFI_SUCCESS) { @@ -1016,6 +968,12 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, } } + status = efi_decompress_kernel(&kernel_entry); + if (status != EFI_SUCCESS) { + efi_err("Failed to decompress kernel\n"); + goto fail; + } + /* * At this point, an initrd may already have been loaded by the * bootloader and passed via bootparams. We permit an initrd loaded @@ -1055,7 +1013,7 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, setup_efi_pci(boot_params); - setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start); + setup_quirks(boot_params); status = exit_boot(boot_params, handle); if (status != EFI_SUCCESS) { @@ -1067,7 +1025,7 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_5level_switch(); #endif - return bzimage_addr; + enter_kernel(kernel_entry, boot_params); fail: efi_err("efi_main() failed!\n");