From patchwork Mon May 8 07:03:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680038 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 779D7C77B75 for ; Mon, 8 May 2023 07:03:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232542AbjEHHD4 (ORCPT ); Mon, 8 May 2023 03:03:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232525AbjEHHDs (ORCPT ); Mon, 8 May 2023 03:03:48 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E4951729; Mon, 8 May 2023 00:03:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0BDF361F85; Mon, 8 May 2023 07:03:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AD105C4339B; Mon, 8 May 2023 07:03:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529424; bh=9BDwSgqeJ0SjBYTlcw8Hj/TbZ+wo730AVxbO98qoyM4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ly4rZSXwCDf9rrQihuUSa1Bj+hNk31Ta6bmJEg9Dszga718CieEWys2FjTBX2HtCr eip1iHqzg4eIKgESfYPGUt6hwVSWYDcccrdfhzwGFZJdncOskRK5UUovPkJA6U8gEe iFWOqIMLkQw3cUKrioQwkDgWo5hdn56voCqLeLbHGMRjfHpwDq1BwneqrtQKpDNTJQ WHiP9meopiMUy6pVeDRppdn+xSHQ3/Sci9cKRXHfqr2/dREbsDXZT+Wa5aK0P5/7TP fYdmI/Ri0RgjKlWwRoW+HQDqbfaTIxrVVL8LzZZPPehTgXzCUx6a+Fpafi2q//a8Xz Dl+HriO/cuepg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 01/20] x86: decompressor: Use proper sequence to take the address of the GOT Date: Mon, 8 May 2023 09:03:11 +0200 Message-Id: <20230508070330.582131-2-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1829; i=ardb@kernel.org; h=from:subject; bh=9BDwSgqeJ0SjBYTlcw8Hj/TbZ+wo730AVxbO98qoyM4=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3qrPbM1OE/6eqf17dOH1jBe/cuIE1l3daLzc0URiV +nDcw43O0pZGMQ4GGTFFFkEZv99t/P0RKla51myMHNYmUCGMHBxCsBEzC8y/NNiPTxr97RTmUdL Li+ayXv/2YMQ7/25cXN3eX1hqst3upHHyDBBRu/AHOUzyWsT3zHyTti1p+fNDd/rzgefpjW+z2z /HM4LAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org We don't actually use a global offset table (GOT) in the 32-bit decompressor, but as is common for 32-bit position independent code, we use the magic symbol _GLOBAL_OFFSET_TABLE_ as an anchor from which to derive the actual runtime addresses of other symbols, using special @GOTOFF symbol references that are resolved at link time, and populated with the distance between the address of the magic _GLOBAL_OFFSET_TABLE_ anchor and the address of the symbol in question. This means _GLOBAL_OFFSET_TABLE_ is the only symbol whose actual runtime address we have to determine explicitly, which is one of the first things we do in startup_32. However, we do so by taking the absolute address via the immediate field of an ADD instruction (plus a small offset), and taking absolute addresses that need to be resolved at link time is what we are trying to avoid. Fortunately, the assembler knows that _GLOBAL_OFFSET_TABLE_ is magic, and emits a special relative relocation instead, and so the resulting code works as expected. However, this is not obvious for someone reading the code, and the use of LEA with an explicit relative addend is more idiomatic so use that instead. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_32.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 987ae727cf9f0d04..53cbee1e2a93efce 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -58,7 +58,7 @@ SYM_FUNC_START(startup_32) leal (BP_scratch+4)(%esi), %esp call 1f 1: popl %edx - addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx + leal (_GLOBAL_OFFSET_TABLE_ - 1b)(%edx), %edx /* Load new GDT */ leal gdt@GOTOFF(%edx), %eax From patchwork Mon May 8 07:03:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680354 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B75CC7EE24 for ; Mon, 8 May 2023 07:03:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232580AbjEHHD5 (ORCPT ); Mon, 8 May 2023 03:03:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232603AbjEHHDz (ORCPT ); Mon, 8 May 2023 03:03:55 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C734C19D64; Mon, 8 May 2023 00:03:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4312B61F85; Mon, 8 May 2023 07:03:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E1456C433A4; Mon, 8 May 2023 07:03:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529428; bh=4E1YJRld28apgAdM8nPpbwbWdnTo7VrUtFu79939OCI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OC7T2hmD4SUQJNpGLRKFJBTqnKj5rSfg3QDg5/J2QH9xMXddP8M7Z32tCnLfimKwi +DjrOxfVKTNTyCv6sxHCo/dDXd/YznpsfaAjg4Rpn803lZXmOH2s+mpZte2crJjlzz qeYo92gLnYySd4YkcMSXutGtfoaeoPE89M0y1n0CPqvlPxd2eHlCWTWi4wCpW/38il 3F3S81nfSnvEgmrxwzFYb4K4HpW0YmMBlonuxs8Vb77lfCOuGSJK1SJAwEP/1FluxL o4mwf6wBzLS8mntwNf6WzNEO+slC0LMPN6v7YA6JZWUeSBh2LzCzr480E5+6Oc9DO3 Ksi+ntgRPRaHg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 02/20] x86: decompressor: Store boot_params pointer in callee save register Date: Mon, 8 May 2023 09:03:12 +0200 Message-Id: <20230508070330.582131-3-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3624; i=ardb@kernel.org; h=from:subject; bh=4E1YJRld28apgAdM8nPpbwbWdnTo7VrUtFu79939OCI=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3uorpjprWVU+T3hw/XlLROxeqbsLJJ29rs+7/OOnw 012Tb91HaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAiS5IZ/umrP+i8OFH9zZoU RcmtNnYbXuaduOvewFCr8fnsna2mF1cw/E+7+eeO4NMKp0M2enocN/nF63wOzWBjTjjiu37G4eo 9OxgA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Instead of pushing and popping %RSI umpteen times to preserve the struct boot_params pointer across the execution of the startup code, just move it into a callee save register before the first call into C, and copy it back when needed. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_64.S | 34 +++++++------------- 1 file changed, 11 insertions(+), 23 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 03c4328a88cbd5d0..59340e533dff8369 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -416,10 +416,14 @@ SYM_CODE_START(startup_64) lretq .Lon_kernel_cs: + /* + * RSI holds a pointer to a boot_params structure provided by the + * loader, and this needs to be preserved across C function calls. So + * move it into a callee saved register. + */ + movq %rsi, %r15 - pushq %rsi call load_stage1_idt - popq %rsi #ifdef CONFIG_AMD_MEM_ENCRYPT /* @@ -432,10 +436,8 @@ SYM_CODE_START(startup_64) * detection/setup to ensure that has been done in advance of any dependent * code. */ - pushq %rsi - movq %rsi, %rdi /* real mode address */ + movq %r15, %rdi /* pass struct boot_params pointer */ call sev_enable - popq %rsi #endif /* @@ -448,13 +450,9 @@ SYM_CODE_START(startup_64) * - Non zero RDX means trampoline needs to enable 5-level * paging. * - * RSI holds real mode data and needs to be preserved across - * this function call. */ - pushq %rsi - movq %rsi, %rdi /* real mode address */ + movq %r15, %rdi /* pass struct boot_params pointer */ call paging_prepare - popq %rsi /* Save the trampoline address in RCX */ movq %rax, %rcx @@ -479,14 +477,9 @@ trampoline_return: * * RDI is address of the page table to use instead of page table * in trampoline memory (if required). - * - * RSI holds real mode data and needs to be preserved across - * this function call. */ - pushq %rsi leaq rva(top_pgtable)(%rbx), %rdi call cleanup_trampoline - popq %rsi /* Zero EFLAGS */ pushq $0 @@ -496,7 +489,6 @@ trampoline_return: * Copy the compressed kernel to the end of our buffer * where decompression in place becomes safe. */ - pushq %rsi leaq (_bss-8)(%rip), %rsi leaq rva(_bss-8)(%rbx), %rdi movl $(_bss - startup_32), %ecx @@ -504,7 +496,6 @@ trampoline_return: std rep movsq cld - popq %rsi /* * The GDT may get overwritten either during the copy we just did or @@ -551,30 +542,27 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) shrq $3, %rcx rep stosq - pushq %rsi call load_stage2_idt /* Pass boot_params to initialize_identity_maps() */ - movq (%rsp), %rdi + movq %r15, %rdi /* pass struct boot_params pointer */ call initialize_identity_maps - popq %rsi /* * Do the extraction, and jump to the new kernel.. */ - pushq %rsi /* Save the real mode argument */ - movq %rsi, %rdi /* real mode address */ + movq %r15, %rdi /* pass struct boot_params pointer */ leaq boot_heap(%rip), %rsi /* malloc area for uncompression */ leaq input_data(%rip), %rdx /* input_data */ movl input_len(%rip), %ecx /* input_len */ movq %rbp, %r8 /* output target address */ movl output_len(%rip), %r9d /* decompressed length, end of relocs */ call extract_kernel /* returns kernel entry point in %rax */ - popq %rsi /* * Jump to the decompressed kernel. */ + movq %r15, %rsi jmp *%rax SYM_FUNC_END(.Lrelocated) From patchwork Mon May 8 07:03:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680037 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41394C77B73 for ; Mon, 8 May 2023 07:04:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232011AbjEHHD7 (ORCPT ); Mon, 8 May 2023 03:03:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232428AbjEHHDz (ORCPT ); Mon, 8 May 2023 03:03:55 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E7F951A112; Mon, 8 May 2023 00:03:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7BF4561F8C; Mon, 8 May 2023 07:03:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 24313C4339E; Mon, 8 May 2023 07:03:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529432; bh=UygRvsEzWeCdWneVvfD1K/VpCrMByhROcGCs3aNJ6SY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lpAxgN8jqzCo90JQr6MwZgUoWPmP3Kg/6DHBmLJLkpYmcJWqdRuCM2wcnmZ5RXMqG YE1Rr3N4tB85eff+lpdEJ3a8IKboGiKb1YoxKXFNbMHTCAHrzkoLr6IEpPHy5+s5Tq DTE5JB6xe/KTeqWQ70Km+Zo/QgdBWRq7OzRu9RYpsXvBCAagRq5jXtA1uBJO1g9/hY P32HfVkZLM89fFmDcjcwSGOSXoMjGT3OyBa86VrYSkChJ5lh/vxndlbWkkGXWJkhOx Y5WcKS3wYKlZQ5u0QLfr0R8C2UCAd3U03xKcCyItmw/c5meBsZJHCZuIJ0TUZiqIHL U/5K6vt3TFGvw== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 03/20] x86: decompressor: Call trampoline as a normal function Date: Mon, 8 May 2023 09:03:13 +0200 Message-Id: <20230508070330.582131-4-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2486; i=ardb@kernel.org; h=from:subject; bh=UygRvsEzWeCdWneVvfD1K/VpCrMByhROcGCs3aNJ6SY=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3hqjT1yTvBb/VbE9LOGpfbNW9DJD75bcqO0rryvO/ eMgefN0RykLgxgHg6yYIovA7L/vdp6eKFXrPEsWZg4rE8gQBi5OAZjIF1dGhvaPc3t2PrXfzBmQ o1u2Uu6YSaVGdOzq80v21i9d+PyBXDIjw5UvBi5rpYt5My4cyfjMHXNZ4sAJh19SxU6cKoZz0z9 1MgEA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Move the long return to switch to 32-bit mode into the trampoline code so we can call it as an ordinary function. This will allow us to call it directly from C code in a subsequent patch. Signed-off-by: Ard Biesheuvel Acked-by: Kirill A. Shutemov --- arch/x86/boot/compressed/head_64.S | 25 +++++++++----------- arch/x86/boot/compressed/pgtable.h | 2 +- 2 files changed, 12 insertions(+), 15 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 59340e533dff8369..81b53b576cdd05ae 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -457,18 +457,9 @@ SYM_CODE_START(startup_64) /* Save the trampoline address in RCX */ movq %rax, %rcx - /* - * Load the address of trampoline_return() into RDI. - * It will be used by the trampoline to return to the main code. - */ - leaq trampoline_return(%rip), %rdi - - /* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */ - pushq $__KERNEL32_CS leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax - pushq %rax - lretq -trampoline_return: + call *%rax + /* Restore the stack, the 32-bit trampoline uses its own stack */ leaq rva(boot_stack_end)(%rbx), %rsp @@ -566,16 +557,22 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) jmp *%rax SYM_FUNC_END(.Lrelocated) - .code32 /* * This is the 32-bit trampoline that will be copied over to low memory. * - * RDI contains the return address (might be above 4G). * ECX contains the base address of the trampoline memory. * Non zero RDX means trampoline needs to enable 5-level paging. */ SYM_CODE_START(trampoline_32bit_src) - /* Set up data and stack segments */ + popq %rdi + /* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */ + pushq $__KERNEL32_CS + leaq 0f(%rip), %rax + pushq %rax + lretq + + .code32 +0: /* Set up data and stack segments */ movl $__KERNEL_DS, %eax movl %eax, %ds movl %eax, %ss diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h index cc9b2529a08634b4..91dbb99203fbce2d 100644 --- a/arch/x86/boot/compressed/pgtable.h +++ b/arch/x86/boot/compressed/pgtable.h @@ -6,7 +6,7 @@ #define TRAMPOLINE_32BIT_PGTABLE_OFFSET 0 #define TRAMPOLINE_32BIT_CODE_OFFSET PAGE_SIZE -#define TRAMPOLINE_32BIT_CODE_SIZE 0x80 +#define TRAMPOLINE_32BIT_CODE_SIZE 0xA0 #define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE From patchwork Mon May 8 07:03:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44B3AC77B73 for ; Mon, 8 May 2023 07:04:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232845AbjEHHEE (ORCPT ); Mon, 8 May 2023 03:04:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232699AbjEHHEA (ORCPT ); Mon, 8 May 2023 03:04:00 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A0061A115; Mon, 8 May 2023 00:03:58 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B28AC61F8A; Mon, 8 May 2023 07:03:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58D98C4339C; Mon, 8 May 2023 07:03:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529437; bh=rxt+/g9xwfTqH/LFwaurDzU1JjHkdOgxvezFD/2D/FU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JInth/2qmJqlZYbTRm+OBct7kDHZzhMMZq0zby9msXFOUslVWYir79Oc3cBP/fqIg tHjSM0kC7Mz5OG1RxNzmn+yZiHGLR+uiw2qD3S6U1uWBq6KArOt5IasImz+7fV520d Z6/SQkQ6BsxeNL3y//45UMZTTyEdo1k5XsBoq8h+hqjrKsm+g1wzkSzEzV0nVaA+bE snPZygFAomT3zFiJREgSYdjm6HEDTtgZt52/l88+sETD2p6DJhlx077XVG4uunPFEr N3WGWbJfucOexe8Al+0ojee7rGa/IY4aCAvl0oBzE8dPPSDCz9vgwUTacEaLu4NQVe Q4rraUhIj5meg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 04/20] x86: decompressor: Use standard calling convention for trampoline Date: Mon, 8 May 2023 09:03:14 +0200 Message-Id: <20230508070330.582131-5-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4077; i=ardb@kernel.org; h=from:subject; bh=rxt+/g9xwfTqH/LFwaurDzU1JjHkdOgxvezFD/2D/FU=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3rrId56lOhdm7XJ40vaGv11ZRUqVt7+iZc6EzVtX7 wk3XPKso5SFQYyDQVZMkUVg9t93O09PlKp1niULM4eVCWQIAxenAEzkw2ZGhtu9lQ3T7ZsZEqdG ciXOXKq4aY/LdpEln4WPXr+QcHynZRgjw7UFzb8EnYVnT5a9svr7/pnPPgibXt91IZDtsReX99x /m7gA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Update the trampoline code so its arguments are passed via RDI and RSI, which matches the ordinary SysV calling convention for x86_64. This will allow us to call this code directly from C. Signed-off-by: Ard Biesheuvel Acked-by: Kirill A. Shutemov --- arch/x86/boot/compressed/head_64.S | 30 +++++++++----------- arch/x86/boot/compressed/pgtable.h | 2 +- 2 files changed, 14 insertions(+), 18 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 81b53b576cdd05ae..b1f8a867777120bb 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -454,9 +454,9 @@ SYM_CODE_START(startup_64) movq %r15, %rdi /* pass struct boot_params pointer */ call paging_prepare - /* Save the trampoline address in RCX */ - movq %rax, %rcx - + /* Pass the trampoline address and boolean flag as args #1 and #2 */ + movq %rax, %rdi + movq %rdx, %rsi leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax call *%rax @@ -560,11 +560,11 @@ SYM_FUNC_END(.Lrelocated) /* * This is the 32-bit trampoline that will be copied over to low memory. * - * ECX contains the base address of the trampoline memory. - * Non zero RDX means trampoline needs to enable 5-level paging. + * EDI contains the base address of the trampoline memory. + * Non-zero ESI means trampoline needs to enable 5-level paging. */ SYM_CODE_START(trampoline_32bit_src) - popq %rdi + popq %r8 /* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */ pushq $__KERNEL32_CS leaq 0f(%rip), %rax @@ -578,7 +578,7 @@ SYM_CODE_START(trampoline_32bit_src) movl %eax, %ss /* Set up new stack */ - leal TRAMPOLINE_32BIT_STACK_END(%ecx), %esp + leal TRAMPOLINE_32BIT_STACK_END(%edi), %esp /* Disable paging */ movl %cr0, %eax @@ -586,7 +586,7 @@ SYM_CODE_START(trampoline_32bit_src) movl %eax, %cr0 /* Check what paging mode we want to be in after the trampoline */ - testl %edx, %edx + testl %esi, %esi jz 1f /* We want 5-level paging: don't touch CR3 if it already points to 5-level page tables */ @@ -601,21 +601,17 @@ SYM_CODE_START(trampoline_32bit_src) jz 3f 2: /* Point CR3 to the trampoline's new top level page table */ - leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%ecx), %eax + leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%edi), %eax movl %eax, %cr3 3: /* Set EFER.LME=1 as a precaution in case hypervsior pulls the rug */ - pushl %ecx - pushl %edx movl $MSR_EFER, %ecx rdmsr btsl $_EFER_LME, %eax /* Avoid writing EFER if no change was made (for TDX guest) */ jc 1f wrmsr -1: popl %edx - popl %ecx - +1: #ifdef CONFIG_X86_MCE /* * Preserve CR4.MCE if the kernel will enable #MC support. @@ -632,14 +628,14 @@ SYM_CODE_START(trampoline_32bit_src) /* Enable PAE and LA57 (if required) paging modes */ orl $X86_CR4_PAE, %eax - testl %edx, %edx + testl %esi, %esi jz 1f orl $X86_CR4_LA57, %eax 1: movl %eax, %cr4 /* Calculate address of paging_enabled() once we are executing in the trampoline */ - leal .Lpaging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFFSET(%ecx), %eax + leal .Lpaging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFFSET(%edi), %eax /* Prepare the stack for far return to Long Mode */ pushl $__KERNEL_CS @@ -656,7 +652,7 @@ SYM_CODE_END(trampoline_32bit_src) .code64 SYM_FUNC_START_LOCAL_NOALIGN(.Lpaging_enabled) /* Return from the trampoline */ - jmp *%rdi + jmp *%r8 SYM_FUNC_END(.Lpaging_enabled) /* diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h index 91dbb99203fbce2d..4e8cef135226bcbb 100644 --- a/arch/x86/boot/compressed/pgtable.h +++ b/arch/x86/boot/compressed/pgtable.h @@ -14,7 +14,7 @@ extern unsigned long *trampoline_32bit; -extern void trampoline_32bit_src(void *return_ptr); +extern void trampoline_32bit_src(void *trampoline, bool enable_5lvl); #endif /* __ASSEMBLER__ */ #endif /* BOOT_COMPRESSED_PAGETABLE_H */ From patchwork Mon May 8 07:03:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680036 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36267C77B75 for ; Mon, 8 May 2023 07:04:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233053AbjEHHEb (ORCPT ); Mon, 8 May 2023 03:04:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232932AbjEHHEH (ORCPT ); Mon, 8 May 2023 03:04:07 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E2391A1F7; Mon, 8 May 2023 00:04:02 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E3B9661F96; Mon, 8 May 2023 07:04:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 92676C433A4; Mon, 8 May 2023 07:03:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529441; bh=Qz3uNSnHMPU7hlGyuEpJliraqb/Fez39IiSe2mqtMoo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MidC1/dw8hHKhTnSTurzN66Te3FasH9dv9RRbVjhDmi0cPle2+vUWXABnMG8TutuI kl2MPAtEHubbYhLNCldB94CYlHTuihuJWGMG04rNMqoCipCfqXtCLaXdvJ7GL9V8Cl QMrIt0id9mQ7E4QSTM0PDGqubGeVft/4yrBKLamt2Hi8NKVNchbBlEL6By+BlDB9Wq v5K7sGrnwEgIRx5Hf70Imlftmym5t9dt27h3aZTO0oD9BDjrRKTayITO8791n8Oply G6niEyDO9GukZN8YfRPEhCYNqc5mh53pYl/motJn32lFL+WXb1UxJoWcQJpJ4G0Mzs WT2s0eGeSh0/w== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 05/20] x86: decompressor: Avoid the need for a stack in the 32-bit trampoline Date: Mon, 8 May 2023 09:03:15 +0200 Message-Id: <20230508070330.582131-6-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=5630; i=ardb@kernel.org; h=from:subject; bh=Qz3uNSnHMPU7hlGyuEpJliraqb/Fez39IiSe2mqtMoo=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3nrewOe799/icPx6dEn/3t+fZXfEZZbuNP/mqf1R5 d7DxJo3HaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAi2fsYGU7qp9znf5r0+ZaT xvXTqou0FX/ktK4Q5uYWc5jjvNPyvDDDf6/OxSZvDJ7VMeS/eZqS5euTN2G9Js+OHCst7V1d6nN VGQA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The 32-bit trampoline no longer uses the stack for anything except performing a long return back to long mode. Currently, this stack is allocated in the same page that carries the trampoline code, which means this page must be mapped writable and executable, and the stack is therefore executable as well. So let's do a long jump instead: that way, we can pre-calculate the return address and poke it into the code before we call it. In a later patch, we will take advantage of this by removing writable permissions (and adding executable ones) explicitly when booting via the EFI stub. Not playing with the stack pointer also makes it more straight-forward to call the trampoline code as an ordinary 64-bit function from C code. Signed-off-by: Ard Biesheuvel Acked-by: Kirill A. Shutemov --- arch/x86/boot/compressed/head_64.S | 34 ++++---------------- arch/x86/boot/compressed/pgtable.h | 6 ++-- arch/x86/boot/compressed/pgtable_64.c | 12 ++++++- 3 files changed, 21 insertions(+), 31 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index b1f8a867777120bb..3b5fc851737ffc39 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -460,9 +460,6 @@ SYM_CODE_START(startup_64) leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax call *%rax - /* Restore the stack, the 32-bit trampoline uses its own stack */ - leaq rva(boot_stack_end)(%rbx), %rsp - /* * cleanup_trampoline() would restore trampoline memory. * @@ -563,24 +560,17 @@ SYM_FUNC_END(.Lrelocated) * EDI contains the base address of the trampoline memory. * Non-zero ESI means trampoline needs to enable 5-level paging. */ + .section ".rodata", "a", @progbits SYM_CODE_START(trampoline_32bit_src) - popq %r8 /* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */ pushq $__KERNEL32_CS leaq 0f(%rip), %rax pushq %rax lretq +.Lret: retq .code32 -0: /* Set up data and stack segments */ - movl $__KERNEL_DS, %eax - movl %eax, %ds - movl %eax, %ss - - /* Set up new stack */ - leal TRAMPOLINE_32BIT_STACK_END(%edi), %esp - - /* Disable paging */ +0: /* Disable paging */ movl %cr0, %eax btrl $X86_CR0_PG_BIT, %eax movl %eax, %cr0 @@ -634,26 +624,16 @@ SYM_CODE_START(trampoline_32bit_src) 1: movl %eax, %cr4 - /* Calculate address of paging_enabled() once we are executing in the trampoline */ - leal .Lpaging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFFSET(%edi), %eax - - /* Prepare the stack for far return to Long Mode */ - pushl $__KERNEL_CS - pushl %eax - /* Enable paging again. */ movl %cr0, %eax btsl $X86_CR0_PG_BIT, %eax movl %eax, %cr0 - lret +.Ljmp: ljmpl $__KERNEL_CS, $(.Lret - trampoline_32bit_src) SYM_CODE_END(trampoline_32bit_src) - .code64 -SYM_FUNC_START_LOCAL_NOALIGN(.Lpaging_enabled) - /* Return from the trampoline */ - jmp *%r8 -SYM_FUNC_END(.Lpaging_enabled) +/* keep this right after trampoline_32bit_src() so we can infer its size */ +SYM_DATA(trampoline_ljmp_imm_offset, .word .Ljmp + 1 - trampoline_32bit_src) /* * The trampoline code has a size limit. @@ -662,7 +642,7 @@ SYM_FUNC_END(.Lpaging_enabled) */ .org trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE - .code32 + .text SYM_FUNC_START_LOCAL_NOALIGN(.Lno_longmode) /* This isn't an x86-64 CPU, so hang intentionally, we cannot continue */ 1: diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h index 4e8cef135226bcbb..131488f50af55d0a 100644 --- a/arch/x86/boot/compressed/pgtable.h +++ b/arch/x86/boot/compressed/pgtable.h @@ -6,9 +6,7 @@ #define TRAMPOLINE_32BIT_PGTABLE_OFFSET 0 #define TRAMPOLINE_32BIT_CODE_OFFSET PAGE_SIZE -#define TRAMPOLINE_32BIT_CODE_SIZE 0xA0 - -#define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE +#define TRAMPOLINE_32BIT_CODE_SIZE 0x80 #ifndef __ASSEMBLER__ @@ -16,5 +14,7 @@ extern unsigned long *trampoline_32bit; extern void trampoline_32bit_src(void *trampoline, bool enable_5lvl); +extern const u16 trampoline_ljmp_imm_offset; + #endif /* __ASSEMBLER__ */ #endif /* BOOT_COMPRESSED_PAGETABLE_H */ diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index 2ac12ff4111bf8c0..09fc18180929fab3 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -109,6 +109,7 @@ static unsigned long find_trampoline_placement(void) struct paging_config paging_prepare(void *rmode) { struct paging_config paging_config = {}; + void *tramp_code; /* Initialize boot_params. Required for cmdline_find_option_bool(). */ boot_params = rmode; @@ -143,9 +144,18 @@ struct paging_config paging_prepare(void *rmode) memset(trampoline_32bit, 0, TRAMPOLINE_32BIT_SIZE); /* Copy trampoline code in place */ - memcpy(trampoline_32bit + TRAMPOLINE_32BIT_CODE_OFFSET / sizeof(unsigned long), + tramp_code = memcpy(trampoline_32bit + + TRAMPOLINE_32BIT_CODE_OFFSET / sizeof(unsigned long), &trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE); + /* + * Avoid the need for a stack in the 32-bit trampoline code, by using + * LJMP rather than LRET to return back to long mode. LJMP takes an + * immediate absolute address, so we have to adjust that based on the + * placement of the trampoline. + */ + *(u32 *)(tramp_code + trampoline_ljmp_imm_offset) += (unsigned long)tramp_code; + /* * The code below prepares page table in trampoline memory. * From patchwork Mon May 8 07:03:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680352 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 958A6C77B75 for ; Mon, 8 May 2023 07:04:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233155AbjEHHEo (ORCPT ); Mon, 8 May 2023 03:04:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232525AbjEHHES (ORCPT ); Mon, 8 May 2023 03:04:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE94F1A488; Mon, 8 May 2023 00:04:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C9EA61F91; Mon, 8 May 2023 07:04:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C5DB8C433EF; Mon, 8 May 2023 07:04:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529445; bh=mrjWwdSAODJDfTtOLh6fqLIu64cazT+ON+Va8cln/ts=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fne6sJwP7Fa3UrJyxIRjBnFksjaMKy45oFVwdGJHZQxT/3KkC29EolWOhYlvQ7zmr Ld+ZOKRuwxo/FP1vLL/tB9S477YMHrZqik6MUeexfTiobGymP7/oseDxhKwTudEMs0 3aBt5oBUBbKfJ/51BU2egy3Nxjhsk9v7IcDIHtQ8u5z/IjCksLeeYM3sEVk7fwyyUt mtRCgzxask/FGBKEh+AtzuHnnG1E4/O8AUtsA8xQAHLcmkzbrfgFsfW7FMlsh14KKc OVk9NR/pHJHAojRossv0wD5O8/dUZG4NGTUoi5xdXQnmfKoW1q9XewxB7bg1NcXvnX vagaT/lVihvTA== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 06/20] x86: decompressor: Call trampoline directly from C code Date: Mon, 8 May 2023 09:03:16 +0200 Message-Id: <20230508070330.582131-7-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4896; i=ardb@kernel.org; h=from:subject; bh=mrjWwdSAODJDfTtOLh6fqLIu64cazT+ON+Va8cln/ts=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3oa9t+3vzf8/6dsevtYP8vuuVfPMm9lScaDIpuZos HaKfZt+RykLgxgHg6yYIovA7L/vdp6eKFXrPEsWZg4rE8gQBi5OAZjI1iOMDBO/N81lFLrzbaFj 6+q0PVrrnI9e5r626Yb4u+YN5UfeLLNh+Ct8Z+ON/j+20/36BBT8vmlsWn9CVPlH85bNMzOKz71 /xMQEAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Instead of returning to the asm calling code to invoke the trampoline, call it straight from the C code that sets the scene. That way, we don't need the struct return type for returning two values, and we can make the call conditional more cleanly in a subsequent patch. Signed-off-by: Ard Biesheuvel Acked-by: Kirill A. Shutemov --- arch/x86/boot/compressed/head_64.S | 20 +++----------- arch/x86/boot/compressed/pgtable_64.c | 28 ++++++++------------ 2 files changed, 15 insertions(+), 33 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 3b5fc851737ffc39..94b614ecb7c2fd55 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -441,24 +441,12 @@ SYM_CODE_START(startup_64) #endif /* - * paging_prepare() sets up the trampoline and checks if we need to - * enable 5-level paging. - * - * paging_prepare() returns a two-quadword structure which lands - * into RDX:RAX: - * - Address of the trampoline is returned in RAX. - * - Non zero RDX means trampoline needs to enable 5-level - * paging. - * + * set_paging_levels() updates the number of paging levels using a + * trampoline in 32-bit addressable memory if the current number does + * not match the desired number. */ movq %r15, %rdi /* pass struct boot_params pointer */ - call paging_prepare - - /* Pass the trampoline address and boolean flag as args #1 and #2 */ - movq %rax, %rdi - movq %rdx, %rsi - leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax - call *%rax + call set_paging_levels /* * cleanup_trampoline() would restore trampoline memory. diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index 09fc18180929fab3..b62b6819dcdd01be 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -16,11 +16,6 @@ unsigned int __section(".data") pgdir_shift = 39; unsigned int __section(".data") ptrs_per_p4d = 1; #endif -struct paging_config { - unsigned long trampoline_start; - unsigned long l5_required; -}; - /* Buffer to preserve trampoline memory */ static char trampoline_save[TRAMPOLINE_32BIT_SIZE]; @@ -106,10 +101,10 @@ static unsigned long find_trampoline_placement(void) return bios_start - TRAMPOLINE_32BIT_SIZE; } -struct paging_config paging_prepare(void *rmode) +asmlinkage void set_paging_levels(void *rmode) { - struct paging_config paging_config = {}; - void *tramp_code; + void (*toggle_la57)(void *trampoline, bool enable_5lvl); + bool l5_required = false; /* Initialize boot_params. Required for cmdline_find_option_bool(). */ boot_params = rmode; @@ -130,12 +125,10 @@ struct paging_config paging_prepare(void *rmode) !cmdline_find_option_bool("no5lvl") && native_cpuid_eax(0) >= 7 && (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) { - paging_config.l5_required = 1; + l5_required = true; } - paging_config.trampoline_start = find_trampoline_placement(); - - trampoline_32bit = (unsigned long *)paging_config.trampoline_start; + trampoline_32bit = (unsigned long *)find_trampoline_placement(); /* Preserve trampoline memory */ memcpy(trampoline_save, trampoline_32bit, TRAMPOLINE_32BIT_SIZE); @@ -144,7 +137,7 @@ struct paging_config paging_prepare(void *rmode) memset(trampoline_32bit, 0, TRAMPOLINE_32BIT_SIZE); /* Copy trampoline code in place */ - tramp_code = memcpy(trampoline_32bit + + toggle_la57 = memcpy(trampoline_32bit + TRAMPOLINE_32BIT_CODE_OFFSET / sizeof(unsigned long), &trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE); @@ -154,7 +147,8 @@ struct paging_config paging_prepare(void *rmode) * immediate absolute address, so we have to adjust that based on the * placement of the trampoline. */ - *(u32 *)(tramp_code + trampoline_ljmp_imm_offset) += (unsigned long)tramp_code; + *(u32 *)((u8 *)toggle_la57 + trampoline_ljmp_imm_offset) += + (unsigned long)toggle_la57; /* * The code below prepares page table in trampoline memory. @@ -170,10 +164,10 @@ struct paging_config paging_prepare(void *rmode) * We are not going to use the page table in trampoline memory if we * are already in the desired paging mode. */ - if (paging_config.l5_required == !!(native_read_cr4() & X86_CR4_LA57)) + if (l5_required == !!(native_read_cr4() & X86_CR4_LA57)) goto out; - if (paging_config.l5_required) { + if (l5_required) { /* * For 4- to 5-level paging transition, set up current CR3 as * the first and the only entry in a new top-level page table. @@ -196,7 +190,7 @@ struct paging_config paging_prepare(void *rmode) } out: - return paging_config; + toggle_la57(trampoline_32bit, l5_required); } void cleanup_trampoline(void *pgtable) From patchwork Mon May 8 07:03:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 681ECC77B75 for ; Mon, 8 May 2023 07:05:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233375AbjEHHFK (ORCPT ); Mon, 8 May 2023 03:05:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46472 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232699AbjEHHEg (ORCPT ); Mon, 8 May 2023 03:04:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77B731C0D0; Mon, 8 May 2023 00:04:10 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6315561F93; Mon, 8 May 2023 07:04:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0A21BC4339E; Mon, 8 May 2023 07:04:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529449; bh=B6tcBjxJdP5tSUcFhPzsS1qpmmFSeFD0AgivQH2wFw4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UGkYQ1dcY9vkCEigwIpAP4SetJp5ouS4HB3kkMNCHaF3WR4OwzALHjucnzLgwFYFH l5Q1OwJP6nXd5Nc5t3s9Sk75H21VD5DLeU42mXS8oXfHHlMnvGP9UGtCyfTJbOh8pF xrJ4pI+X56MxLkHdjXe5TIscFo+IbPV0ZyEUJJK+vKygWyK+nAgp8u8JMDG7Z6QbQs FK1fd1G7ceMnNa4oFSJWxieTRPPTnDUplGD/3HkDvzABz2hHekcAxuNSX1DuXbSMhR jeQTHilekptQEB8ANd2fLJ+VBubZXWPsZ9u+7glYLY2ahMl8tYPgjNSO1pkDY+xc8T ltnDOgtqyDXEA== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 07/20] x86: decompressor: Only call the trampoline when changing paging levels Date: Mon, 8 May 2023 09:03:17 +0200 Message-Id: <20230508070330.582131-8-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3661; i=ardb@kernel.org; h=from:subject; bh=B6tcBjxJdP5tSUcFhPzsS1qpmmFSeFD0AgivQH2wFw4=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3qZDmsqVvm9Z8pJSJ76a/7ZgloXwE/X/mYcZBBgOX y8VF9nUUcrCIMbBICumyCIw+++7nacnStU6z5KFmcPKBDKEgYtTACZyqJ6RYeIx731OO+41O8g8 5Pqq3b5kSUm7aErMdaF3TKt3Ok88MI+R4cyD9ZJHzi/fv9eKWVFl2/HPCYmfWDY0Miw+KceV/Hz vT1YA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Since we know the current and desired number of paging levels when preparing the trampoline, let's not call the trampoline at all when we know that calling it is not going to result in a change to the number of paging levels. Given that we are already running in long mode, the PAE and LA57 settings are necessarily consistent with the currently active page tables - the only difference is that we will always preserve CR4.MCE in this case, but this will be cleared by the real kernel startup code if CONFIG_X86_MCE is not enabled. Signed-off-by: Ard Biesheuvel Acked-by: Kirill A. Shutemov --- arch/x86/boot/compressed/head_64.S | 21 +------------------- arch/x86/boot/compressed/pgtable_64.c | 18 +++++++---------- 2 files changed, 8 insertions(+), 31 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 94b614ecb7c2fd55..ccdfe7e55c36a40f 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -398,10 +398,6 @@ SYM_CODE_START(startup_64) * For the trampoline, we need the top page table to reside in lower * memory as we don't have a way to load 64-bit values into CR3 in * 32-bit mode. - * - * We go though the trampoline even if we don't have to: if we're - * already in a desired paging mode. This way the trampoline code gets - * tested on every boot. */ /* Make sure we have GDT with 32-bit code segment */ @@ -563,25 +559,10 @@ SYM_CODE_START(trampoline_32bit_src) btrl $X86_CR0_PG_BIT, %eax movl %eax, %cr0 - /* Check what paging mode we want to be in after the trampoline */ - testl %esi, %esi - jz 1f - - /* We want 5-level paging: don't touch CR3 if it already points to 5-level page tables */ - movl %cr4, %eax - testl $X86_CR4_LA57, %eax - jnz 3f - jmp 2f -1: - /* We want 4-level paging: don't touch CR3 if it already points to 4-level page tables */ - movl %cr4, %eax - testl $X86_CR4_LA57, %eax - jz 3f -2: /* Point CR3 to the trampoline's new top level page table */ leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%edi), %eax movl %eax, %cr3 -3: + /* Set EFER.LME=1 as a precaution in case hypervsior pulls the rug */ movl $MSR_EFER, %ecx rdmsr diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index b62b6819dcdd01be..b92cf1d6e156d5f6 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -128,6 +128,13 @@ asmlinkage void set_paging_levels(void *rmode) l5_required = true; } + /* + * We are not going to use the trampoline if we + * are already in the desired paging mode. + */ + if (l5_required == !!(native_read_cr4() & X86_CR4_LA57)) + return; + trampoline_32bit = (unsigned long *)find_trampoline_placement(); /* Preserve trampoline memory */ @@ -155,18 +162,8 @@ asmlinkage void set_paging_levels(void *rmode) * * The new page table will be used by trampoline code for switching * from 4- to 5-level paging or vice versa. - * - * If switching is not required, the page table is unused: trampoline - * code wouldn't touch CR3. */ - /* - * We are not going to use the page table in trampoline memory if we - * are already in the desired paging mode. - */ - if (l5_required == !!(native_read_cr4() & X86_CR4_LA57)) - goto out; - if (l5_required) { /* * For 4- to 5-level paging transition, set up current CR3 as @@ -189,7 +186,6 @@ asmlinkage void set_paging_levels(void *rmode) (void *)src, PAGE_SIZE); } -out: toggle_la57(trampoline_32bit, l5_required); } From patchwork Mon May 8 07:03:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41960C77B73 for ; Mon, 8 May 2023 07:05:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232932AbjEHHFL (ORCPT ); Mon, 8 May 2023 03:05:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233143AbjEHHEo (ORCPT ); Mon, 8 May 2023 03:04:44 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C23E1C0F7; Mon, 8 May 2023 00:04:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9322861F96; Mon, 8 May 2023 07:04:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4069FC433D2; Mon, 8 May 2023 07:04:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529454; bh=Cbgin3O3w5TrAUQhn3/olaZs0A4XgRk35uCUUkZRmeM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ENwWEqdCtVzf0zk6FhW1aJ99HkXt7BD5OjSafuLC9oHhpagtOXKtTghetJRs30NXs AH4NNjgyAufSkCxPa/060OQepjjrZSpiPmY2UWaOfGj73FAiHYvN8i/MhyHnJ0hsyV 5R+6Ocy4N1dbrLMbcPmBYAlCJ3yxdwytYS6+o5CGLbvuEQ3p943QppwiEetM6cRR2O yLuyDI9g+e+tsK57ufIJd1hWztDtOHC3NIVG+53/jw6ix+smS//QaX//nx5rX3TYrD 7c7oJKdB0e1BIz6jo9xPVmoiiJ6YRreRNoaZy6yUC7eiZYw5sPT1bRtc37BPunbBNm rZS7DICb36CbQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 08/20] x86: decompressor: Merge trampoline cleanup with switching code Date: Mon, 8 May 2023 09:03:18 +0200 Message-Id: <20230508070330.582131-9-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4104; i=ardb@kernel.org; h=from:subject; bh=Cbgin3O3w5TrAUQhn3/olaZs0A4XgRk35uCUUkZRmeM=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3ubiDVq+31j5S78ITuP8/KjtsUHPLWmfkgnW5QLuf Gq7TrB2lLIwiHEwyIopsgjM/vtu5+mJUrXOs2Rh5rAygQxh4OIUgIk8fsPwP7KoU8HwhMCMeQ+C fzKFZn3n2HKkWmNxvsSrpvwdfy8vv8bIcLujhnG+p/X94lkrLuSnqF5lnn58WYJA1BKtpSdyv99 R5AcA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Now that the trampoline setup code and the actual invocation of it are all done from the C routine, we can merge the trampoline cleanup into that as well, instead of returning to asm just to call another C function. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_64.S | 13 +++------ arch/x86/boot/compressed/pgtable_64.c | 28 ++++++++------------ 2 files changed, 15 insertions(+), 26 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index ccdfe7e55c36a40f..21af9cfd0dd0afb7 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -440,19 +440,14 @@ SYM_CODE_START(startup_64) * set_paging_levels() updates the number of paging levels using a * trampoline in 32-bit addressable memory if the current number does * not match the desired number. + * + * RSI is the relocated address of the page table to use instead of + * page table in trampoline memory (if required). */ movq %r15, %rdi /* pass struct boot_params pointer */ + leaq rva(top_pgtable)(%rbx), %rsi call set_paging_levels - /* - * cleanup_trampoline() would restore trampoline memory. - * - * RDI is address of the page table to use instead of page table - * in trampoline memory (if required). - */ - leaq rva(top_pgtable)(%rbx), %rdi - call cleanup_trampoline - /* Zero EFLAGS */ pushq $0 popfq diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index b92cf1d6e156d5f6..eeddad8c8335655e 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -101,9 +101,10 @@ static unsigned long find_trampoline_placement(void) return bios_start - TRAMPOLINE_32BIT_SIZE; } -asmlinkage void set_paging_levels(void *rmode) +asmlinkage void set_paging_levels(void *rmode, void *pgtable) { void (*toggle_la57)(void *trampoline, bool enable_5lvl); + void *trampoline_pgtable; bool l5_required = false; /* Initialize boot_params. Required for cmdline_find_option_bool(). */ @@ -133,7 +134,7 @@ asmlinkage void set_paging_levels(void *rmode) * are already in the desired paging mode. */ if (l5_required == !!(native_read_cr4() & X86_CR4_LA57)) - return; + goto out; trampoline_32bit = (unsigned long *)find_trampoline_placement(); @@ -163,6 +164,8 @@ asmlinkage void set_paging_levels(void *rmode) * The new page table will be used by trampoline code for switching * from 4- to 5-level paging or vice versa. */ + trampoline_pgtable = trampoline_32bit + + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long); if (l5_required) { /* @@ -182,31 +185,21 @@ asmlinkage void set_paging_levels(void *rmode) * may be above 4G. */ src = *(unsigned long *)__native_read_cr3() & PAGE_MASK; - memcpy(trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long), - (void *)src, PAGE_SIZE); + memcpy(trampoline_pgtable, (void *)src, PAGE_SIZE); } toggle_la57(trampoline_32bit, l5_required); -} - -void cleanup_trampoline(void *pgtable) -{ - void *trampoline_pgtable; - - trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long); /* - * Move the top level page table out of trampoline memory, - * if it's there. + * Move the top level page table out of trampoline memory. */ - if ((void *)__native_read_cr3() == trampoline_pgtable) { - memcpy(pgtable, trampoline_pgtable, PAGE_SIZE); - native_write_cr3((unsigned long)pgtable); - } + memcpy(pgtable, trampoline_pgtable, PAGE_SIZE); + native_write_cr3((unsigned long)pgtable); /* Restore trampoline memory */ memcpy(trampoline_32bit, trampoline_save, TRAMPOLINE_32BIT_SIZE); +out: /* Initialize variables for 5-level paging */ #ifdef CONFIG_X86_5LEVEL if (__read_cr4() & X86_CR4_LA57) { @@ -215,4 +208,5 @@ void cleanup_trampoline(void *pgtable) ptrs_per_p4d = 512; } #endif + return; } From patchwork Mon May 8 07:03:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680034 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B6D9C7EE24 for ; Mon, 8 May 2023 07:05:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232849AbjEHHFO (ORCPT ); Mon, 8 May 2023 03:05:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233052AbjEHHEv (ORCPT ); Mon, 8 May 2023 03:04:51 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 889051D94E; Mon, 8 May 2023 00:04:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C655461F8E; Mon, 8 May 2023 07:04:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7498EC4339E; Mon, 8 May 2023 07:04:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529458; bh=VPdQcfiofp2M7ooAjtAnUJeEgqKJFWTMviAMCuGGVF8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MhaxFeLau1dvafm0jJWXvK9QInCA5eHAWaLkKGbSAm0nGGNSJfN6L3Aoh99QiU3/8 5xv9E+OsWiQZBb1kZ4i+6MM97pixg/MWY+9QXaiMy39OrDYM/+zdLUA+xMIURXIPC/ 8/8haFD82HWJZWBgezrr8SlNS8dzS5uzfWc9piswHlWmBpavC8w58fXq06Pl9XmthC /m6DVD7M7rJPWjHY3WT4o0BrawddXUQeIdW1WGZv8FhntZXoXWQ3J0g7llilDlESkv cpkfkw01zyU4Ulcyiw360eCewja2F9/tgwG3SvrrRd+aOARjuqOsBw0p1TvyGAPiTl ktBibFkXR8K3A== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 09/20] x86: efistub: Perform 4/5 level paging switch from the stub Date: Mon, 8 May 2023 09:03:19 +0200 Message-Id: <20230508070330.582131-10-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6966; i=ardb@kernel.org; h=from:subject; bh=VPdQcfiofp2M7ooAjtAnUJeEgqKJFWTMviAMCuGGVF8=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3paEC0fE3/dpyz7xe7/L/OB1Pr1zit/PP95RNvWsV m39DM7tHaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAiX4IZGR68vmW9yvPXEqeo k7OqX1VN0772ZJccu/ChiqndXz7d7TjO8JNxCmcA8/dT899pWySpTt9zPnLPtztKxx7d2vplrd7 Lik4+AA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org In preparation for updating the EFI stub boot flow to avoid the bare metal decompressor code altogether, implement the support code for switching between 4 and 5 levels of paging before jumping to the kernel proper. This reuses the newly refactored trampoline that the bare metal decompressor uses, but relies on EFI APIs to allocate 32-bit addressable memory and remap it with the appropriate permissions. Given that the bare metal decompressor will no longer call into the trampoline if the number of paging levels is already set correctly, we no longer need to remove NX restrictions from the memory range where this trampoline may end up. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/efi-stub-helper.c | 4 + drivers/firmware/efi/libstub/x86-stub.c | 119 ++++++++++++++++---- 2 files changed, 102 insertions(+), 21 deletions(-) diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c index 1e0203d74691ffcc..fc5f3b4c45e91401 100644 --- a/drivers/firmware/efi/libstub/efi-stub-helper.c +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c @@ -16,6 +16,8 @@ #include "efistub.h" +extern bool efi_no5lvl; + bool efi_nochunk; bool efi_nokaslr = !IS_ENABLED(CONFIG_RANDOMIZE_BASE); bool efi_novamap; @@ -73,6 +75,8 @@ efi_status_t efi_parse_options(char const *cmdline) efi_loglevel = CONSOLE_LOGLEVEL_QUIET; } else if (!strcmp(param, "noinitrd")) { efi_noinitrd = true; + } else if (IS_ENABLED(CONFIG_X86_64) && !strcmp(param, "no5lvl")) { + efi_no5lvl = true; } else if (!strcmp(param, "efi") && val) { efi_nochunk = parse_option_str(val, "nochunk"); efi_novamap |= parse_option_str(val, "novamap"); diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index a0bfd31358ba97b1..fb83a72ad905ad6e 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -267,32 +267,11 @@ adjust_memory_range_protection(unsigned long start, unsigned long size) } } -/* - * Trampoline takes 2 pages and can be loaded in first megabyte of memory - * with its end placed between 128k and 640k where BIOS might start. - * (see arch/x86/boot/compressed/pgtable_64.c) - * - * We cannot find exact trampoline placement since memory map - * can be modified by UEFI, and it can alter the computed address. - */ - -#define TRAMPOLINE_PLACEMENT_BASE ((128 - 8)*1024) -#define TRAMPOLINE_PLACEMENT_SIZE (640*1024 - (128 - 8)*1024) - void startup_32(struct boot_params *boot_params); static void setup_memory_protection(unsigned long image_base, unsigned long image_size) { - /* - * Allow execution of possible trampoline used - * for switching between 4- and 5-level page tables - * and relocated kernel image. - */ - - adjust_memory_range_protection(TRAMPOLINE_PLACEMENT_BASE, - TRAMPOLINE_PLACEMENT_SIZE); - #ifdef CONFIG_64BIT if (image_base != (unsigned long)startup_32) adjust_memory_range_protection(image_base, image_size); @@ -760,6 +739,96 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle) return EFI_SUCCESS; } +bool efi_no5lvl; + +static void (*la57_toggle)(void *trampoline, bool enable_5lvl); + +extern void trampoline_32bit_src(void *, bool); +extern const u16 trampoline_ljmp_imm_offset; + +/* + * Enabling (or disabling) 5 level paging is tricky, because it can only be + * done from 32-bit mode with paging disabled. This means not only that the + * code itself must be running from 32-bit addressable physical memory, but + * also that the root page table must be 32-bit addressable, as we cannot + * program a 64-bit value into CR3 when running in 32-bit mode. + */ +static efi_status_t efi_setup_5level_paging(void) +{ + u8 tmpl_size = (u8 *)&trampoline_ljmp_imm_offset - (u8 *)&trampoline_32bit_src; + efi_status_t status; + u8 *la57_code; + + if (!efi_is_64bit()) + return EFI_SUCCESS; + + /* check for 5 level paging support */ + if (native_cpuid_eax(0) < 7 || + !(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) + return EFI_SUCCESS; + + /* allocate some 32-bit addressable memory for code and a page table */ + status = efi_allocate_pages(2 * PAGE_SIZE, (unsigned long *)&la57_code, + U32_MAX); + if (status != EFI_SUCCESS) + return status; + + la57_toggle = memcpy(la57_code, trampoline_32bit_src, tmpl_size); + memset(la57_code + tmpl_size, 0x90, PAGE_SIZE - tmpl_size); + + /* + * To avoid having to allocate a 32-bit addressable stack, we use a + * ljmp to switch back to long mode. However, this takes an absolute + * address, so we have to poke it in at runtime. + */ + *(u32 *)&la57_code[trampoline_ljmp_imm_offset] += (unsigned long)la57_code; + + adjust_memory_range_protection((unsigned long)la57_toggle, PAGE_SIZE); + + return EFI_SUCCESS; +} + +static void efi_5level_switch(void) +{ +#ifdef CONFIG_X86_64 + static const struct desc_struct gdt[] = { + [GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff), + [GDT_ENTRY_KERNEL_CS] = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff), + }; + + bool want_la57 = IS_ENABLED(CONFIG_X86_5LEVEL) && !efi_no5lvl; + bool have_la57 = native_read_cr4() & X86_CR4_LA57; + bool need_toggle = want_la57 ^ have_la57; + u64 *pgt = (void *)la57_toggle + PAGE_SIZE; + u64 *cr3 = (u64 *)__native_read_cr3(); + u64 *new_cr3; + + if (!la57_toggle || !need_toggle) + return; + + if (!have_la57) { + /* + * We are going to enable 5 level paging, so we need to + * allocate a root level page from the 32-bit addressable + * physical region, and plug the existing hierarchy into it. + */ + new_cr3 = memset(pgt, 0, PAGE_SIZE); + new_cr3[0] = (u64)cr3 | _PAGE_TABLE_NOENC; + } else { + // take the new root table pointer from the current entry #0 + new_cr3 = (u64 *)(cr3[0] & PAGE_MASK); + + // copy the new root level table if it is not 32-bit addressable + if ((u64)new_cr3 > U32_MAX) + new_cr3 = memcpy(pgt, new_cr3, PAGE_SIZE); + } + + native_load_gdt(&(struct desc_ptr){ sizeof(gdt) - 1, (u64)gdt }); + + la57_toggle(new_cr3, want_la57); +#endif +} + /* * On success, we return the address of startup_32, which has potentially been * relocated by efi_relocate_kernel. @@ -787,6 +856,12 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_dxe_table = NULL; } + status = efi_setup_5level_paging(); + if (status != EFI_SUCCESS) { + efi_err("efi_setup_5level_paging() failed!\n"); + goto fail; + } + /* * If the kernel isn't already loaded at a suitable address, * relocate it. @@ -905,6 +980,8 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, goto fail; } + efi_5level_switch(); + return bzimage_addr; fail: efi_err("efi_main() failed!\n"); From patchwork Mon May 8 07:03:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1CE8C77B75 for ; Mon, 8 May 2023 07:05:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233137AbjEHHF3 (ORCPT ); Mon, 8 May 2023 03:05:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233267AbjEHHFA (ORCPT ); Mon, 8 May 2023 03:05:00 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9B7751E987; Mon, 8 May 2023 00:04:23 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 070AF61F88; Mon, 8 May 2023 07:04:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7EC3C433EF; Mon, 8 May 2023 07:04:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529462; bh=gkBM0ParWTIOIX3agXXl7pVKn11FNRtW92FgScbL5xQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OGG8NyGj5znQYhIMlMU/p9vUWyXyS9VMx4MiWr+opEEKHHjRiVI8o6LrXVPAVnAVX FoSgFssVRcFDFVLwuetKIcASm36Yx8RfkQTvFBz+ZKrVJ58IEssCcvPWpN0KQhHgOZ zK1t/b7Mz7G2SsTOcu9uVdhK6j8YUmYOlbzWDFmd0BiExgvhda7Zz34GCKCvSLy6cf FaiIaAjx2V2P20jMc96TfCv6zXg/Xe3wzFD0kuvxV5mt5sKGwyzG5BYmQTLxXTsZXP 1xtjUXoY/a6qSS8d3ilUafDmrs900lCvWrPCcUx1V+3Owl3h61WDHksKihb4z063n1 xamyCEzuiJuaQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 10/20] x86: efistub: Prefer EFI memory attributes protocol over DXE services Date: Mon, 8 May 2023 09:03:20 +0200 Message-Id: <20230508070330.582131-11-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3304; i=ardb@kernel.org; h=from:subject; bh=gkBM0ParWTIOIX3agXXl7pVKn11FNRtW92FgScbL5xQ=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3tYnz/gWzCoT49m797dZf8HuwijftT/58neYvHk2Z 9rNib47O0pZGMQ4GGTFFFkEZv99t/P0RKla51myMHNYmUCGMHBxCsBEzsxj+O/B+n2jWqTIzXYD keDz8btUplgYh8ffeOZzZu2zG66Z6yUZGf5uydbpeS4jPeGtbHlvp5uz/0njj1rfgpvZFiw9fED Ghg8A X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Currently, we rely on DXE services in some cases to clear non-execute restrictions from page allocations that need to be executable. This is dodgy, because DXE services are not specified by UEFI but by PI, and they are not intended for consumption by OS loaders. However, no alternative existed at the time. Now, there is a new UEFI protocol that should be used instead, so if it exists, prefer it over the DXE services calls. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/x86-stub.c | 29 ++++++++++++++------ 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index fb83a72ad905ad6e..ce8434fce0c37982 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -25,6 +25,7 @@ const efi_system_table_t *efi_system_table; const efi_dxe_services_table_t *efi_dxe_table; u32 image_offset __section(".data"); static efi_loaded_image_t *image = NULL; +static efi_memory_attribute_protocol_t *memattr; static efi_status_t preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom) @@ -221,12 +222,18 @@ adjust_memory_range_protection(unsigned long start, unsigned long size) unsigned long rounded_start, rounded_end; unsigned long unprotect_start, unprotect_size; - if (efi_dxe_table == NULL) - return; - rounded_start = rounddown(start, EFI_PAGE_SIZE); rounded_end = roundup(start + size, EFI_PAGE_SIZE); + if (memattr != NULL) { + efi_call_proto(memattr, clear_memory_attributes, rounded_start, + rounded_end - rounded_start, EFI_MEMORY_XP); + return; + } + + if (efi_dxe_table == NULL) + return; + /* * Don't modify memory region attributes, they are * already suitable, to lower the possibility to @@ -838,6 +845,7 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_system_table_t *sys_table_arg, struct boot_params *boot_params) { + efi_guid_t guid = EFI_MEMORY_ATTRIBUTE_PROTOCOL_GUID; unsigned long bzimage_addr = (unsigned long)startup_32; unsigned long buffer_start, buffer_end; struct setup_header *hdr = &boot_params->hdr; @@ -849,13 +857,18 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) efi_exit(handle, EFI_INVALID_PARAMETER); - efi_dxe_table = get_efi_config_table(EFI_DXE_SERVICES_TABLE_GUID); - if (efi_dxe_table && - efi_dxe_table->hdr.signature != EFI_DXE_SERVICES_TABLE_SIGNATURE) { - efi_warn("Ignoring DXE services table: invalid signature\n"); - efi_dxe_table = NULL; + if (IS_ENABLED(CONFIG_EFI_DXE_MEM_ATTRIBUTES)) { + efi_dxe_table = get_efi_config_table(EFI_DXE_SERVICES_TABLE_GUID); + if (efi_dxe_table && + efi_dxe_table->hdr.signature != EFI_DXE_SERVICES_TABLE_SIGNATURE) { + efi_warn("Ignoring DXE services table: invalid signature\n"); + efi_dxe_table = NULL; + } } + /* grab the memory attributes protocol if it exists */ + efi_bs_call(locate_protocol, &guid, NULL, (void **)&memattr); + status = efi_setup_5level_paging(); if (status != EFI_SUCCESS) { efi_err("efi_setup_5level_paging() failed!\n"); From patchwork Mon May 8 07:03:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680033 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB307C7EE24 for ; Mon, 8 May 2023 07:05:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233206AbjEHHFc (ORCPT ); Mon, 8 May 2023 03:05:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233308AbjEHHFD (ORCPT ); Mon, 8 May 2023 03:05:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B11911A480; Mon, 8 May 2023 00:04:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4519861F82; Mon, 8 May 2023 07:04:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DC7FCC4339B; Mon, 8 May 2023 07:04:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529466; bh=TDhagb7/1BjVNI9fsBOdvguiTH1+pJKnlE7Km5AKGt0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AlLKTvGdAWpI/A5BYkRTmnyqkQA1abi7JpJ4m4Y2pKwEswrR5OqiqiWrbyds6PFKp blWfB6MzVhNaeNiQrRDzcQlwVWZPI0pFkpC9O4AAr7+fSCaZPjBJHbHcAyTZiprnz8 eQ5t3GZN8KGyC8iBuPh9/eZQqnOK6PDUALr8VrsZgpf0VekP6SqBVoUFTalLYTnW86 pMiNJkNSZW7mpMp8FzPc61EetiEWY8VaBgsh2YYyfxmvyyMILVTP3QbbJKCcpe5JJi fHgPCt3Oa6/jU99481FemscAHT3eAf3mdSDA+/9EWLF67rVvK1tyZ5u6nVIAjcUPid dmKOHLiuL4MOg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 11/20] decompress: Use 8 byte alignment Date: Mon, 8 May 2023 09:03:21 +0200 Message-Id: <20230508070330.582131-12-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=737; i=ardb@kernel.org; h=from:subject; bh=TDhagb7/1BjVNI9fsBOdvguiTH1+pJKnlE7Km5AKGt0=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3nbH4HPMVjN2iPOzhUuqy3Wvr7hRdEN69SXmiLqDH FP5A+50lLIwiHEwyIopsgjM/vtu5+mJUrXOs2Rh5rAygQxh4OIUgIlY/Wf4nz078WqjW6OU7pSr zvfyjJ0vl+d4tX9x1l9uzVEuX1mzk5Ghme+vaE7+jN2PMkqv8C/237HzsG3lq69LlI7vXPIt5tt RXgA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The ZSTD decompressor requires malloc() allocations to be 8 byte aligned, so ensure that this the case. Signed-off-by: Ard Biesheuvel --- include/linux/decompress/mm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/decompress/mm.h b/include/linux/decompress/mm.h index 9192986b1a731323..ac862422df158bef 100644 --- a/include/linux/decompress/mm.h +++ b/include/linux/decompress/mm.h @@ -48,7 +48,7 @@ MALLOC_VISIBLE void *malloc(int size) if (!malloc_ptr) malloc_ptr = free_mem_ptr; - malloc_ptr = (malloc_ptr + 3) & ~3; /* Align */ + malloc_ptr = (malloc_ptr + 7) & ~7; /* Align */ p = (void *)malloc_ptr; malloc_ptr += size; From patchwork Mon May 8 07:03:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1A48C77B73 for ; Mon, 8 May 2023 07:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233253AbjEHHFo (ORCPT ); Mon, 8 May 2023 03:05:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233364AbjEHHFJ (ORCPT ); Mon, 8 May 2023 03:05:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F5D71F48C; Mon, 8 May 2023 00:04:32 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 71A7761F8C; Mon, 8 May 2023 07:04:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1F6DEC433A1; Mon, 8 May 2023 07:04:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529470; bh=4yAB7qU2wBVgLTvHIHU6cIIZ4zGzroQGyA3YqSYrfmM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lmwammIdU7YC9lbQBSs3qyH0YJ63WWeoJAf7K+TgDHLan4Oe3DmD1oHuptUzWWc5U FwiCLsGsQqDkxOLtj33NzVFdUeyaFd+qtelkUENTiIXr8z7RWb8mYtqxOj2RwYseek 4HjUzLyZj1fTQ087NzlKaipHBUwuCBXCEWT3MWL8IxNRfU/TWpYNkofDjhlLzr2kG0 YW4w+Air4OSGcX+Ell3abGYv3h/EhUOgBXY1Jbh8Q7NEQUxgzRtYQLmQsPo7V7BO/A WFRKo16YjIZ7y9rbVMZeuuOagb1rm0iMPc4Lewvfl9YO/rTPXa6tpqvCxn315Eo3ON 269KlaI2Z8e0A== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 12/20] x86: decompressor: Move global symbol references to C code Date: Mon, 8 May 2023 09:03:22 +0200 Message-Id: <20230508070330.582131-13-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4975; i=ardb@kernel.org; h=from:subject; bh=4yAB7qU2wBVgLTvHIHU6cIIZ4zGzroQGyA3YqSYrfmM=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3o4Fr6aHe4veZr3r/3Jto5qFwOb9/z3TLx9xia3a3 PO967BVRykLgxgHg6yYIovA7L/vdp6eKFXrPEsWZg4rE8gQBi5OAZjIlbsMf/iSG169XyMmGr3e Mv5yZ9tM/Yqscz93XHF/acJe0jvtfx0jw4cyze1xZQ4qYqrv1v7vfaE/a2bw6s0it5n97O8/dep /yQcA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org We no longer have to be cautious when referring to global variables in the position independent decompressor code, now that we use PIE codegen and assert in the linker script that no GOT entries exist (which would require adjustment for the actual runtime load address of the decompressor binary). This means we can simply refer to global variables directly, instead of passing them into C routines from asm code. Let's do so for the code that will be called directly from the EFI stub after a subsequent patch, and avoid the need to pass these quantities directly. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_32.S | 8 -------- arch/x86/boot/compressed/head_64.S | 8 +------- arch/x86/boot/compressed/misc.c | 16 +++++++++------- 3 files changed, 10 insertions(+), 22 deletions(-) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 53cbee1e2a93efce..8ee8749007595fcc 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -179,13 +179,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) */ /* push arguments for extract_kernel: */ - pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */ pushl %ebp /* output address */ - pushl input_len@GOTOFF(%ebx) /* input_len */ - leal input_data@GOTOFF(%ebx), %eax - pushl %eax /* input_data */ - leal boot_heap@GOTOFF(%ebx), %eax - pushl %eax /* heap area */ pushl %esi /* real mode pointer */ call extract_kernel /* returns kernel entry point in %eax */ addl $24, %esp @@ -213,8 +207,6 @@ SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end) */ .bss .balign 4 -boot_heap: - .fill BOOT_HEAP_SIZE, 1, 0 boot_stack: .fill BOOT_STACK_SIZE, 1, 0 boot_stack_end: diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 21af9cfd0dd0afb7..bcf678aed81468e3 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -519,11 +519,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) * Do the extraction, and jump to the new kernel.. */ movq %r15, %rdi /* pass struct boot_params pointer */ - leaq boot_heap(%rip), %rsi /* malloc area for uncompression */ - leaq input_data(%rip), %rdx /* input_data */ - movl input_len(%rip), %ecx /* input_len */ - movq %rbp, %r8 /* output target address */ - movl output_len(%rip), %r9d /* decompressed length, end of relocs */ + movq %rbp, %rsi /* output target address */ call extract_kernel /* returns kernel entry point in %rax */ /* @@ -651,8 +647,6 @@ SYM_DATA_END_LABEL(boot_idt, SYM_L_GLOBAL, boot_idt_end) */ .bss .balign 4 -SYM_DATA_LOCAL(boot_heap, .fill BOOT_HEAP_SIZE, 1, 0) - SYM_DATA_START_LOCAL(boot_stack) .fill BOOT_STACK_SIZE, 1, 0 .balign 16 diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 014ff222bf4b35c2..1cd40cb9fb4e5027 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -330,6 +330,11 @@ static size_t parse_elf(void *output) return ehdr.e_entry - LOAD_PHYSICAL_ADDR; } +static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4); + +extern unsigned char input_data[]; +extern unsigned int input_len, output_len; + /* * The compressed kernel image (ZO), has been moved so that its position * is against the end of the buffer used to hold the uncompressed kernel @@ -347,14 +352,11 @@ static size_t parse_elf(void *output) * |-------uncompressed kernel image---------| * */ -asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, - unsigned char *input_data, - unsigned long input_len, - unsigned char *output, - unsigned long output_len) +asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output) { const unsigned long kernel_total_size = VO__end - VO__text; unsigned long virt_addr = LOAD_PHYSICAL_ADDR; + memptr heap = (memptr)boot_heap; unsigned long needed_size; size_t entry_offset; @@ -412,7 +414,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, * entries. This ensures the full mapped area is usable RAM * and doesn't include any reserved areas. */ - needed_size = max(output_len, kernel_total_size); + needed_size = max((unsigned long)output_len, kernel_total_size); #ifdef CONFIG_X86_64 needed_size = ALIGN(needed_size, MIN_KERNEL_ALIGN); #endif @@ -443,7 +445,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, #ifdef CONFIG_X86_64 if (heap > 0x3fffffffffffUL) error("Destination address too large"); - if (virt_addr + max(output_len, kernel_total_size) > KERNEL_IMAGE_SIZE) + if (virt_addr + needed_size > KERNEL_IMAGE_SIZE) error("Destination virtual address is beyond the kernel mapping area"); #else if (heap > ((-__PAGE_OFFSET-(128<<20)-1) & 0x7fffffff)) From patchwork Mon May 8 07:03:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680032 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4268C7EE24 for ; Mon, 8 May 2023 07:05:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232943AbjEHHFq (ORCPT ); Mon, 8 May 2023 03:05:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233378AbjEHHFK (ORCPT ); Mon, 8 May 2023 03:05:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 255A81A4AB; Mon, 8 May 2023 00:04:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A52F661F93; Mon, 8 May 2023 07:04:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 53317C4339C; Mon, 8 May 2023 07:04:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529475; bh=PS6uh5wyY4cP3z3FsRLfK/TPFwIu+D7LfOjVysY9Mk0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=p7KBlAYirJZQt//TF3ysIWzDilAocdvTP2HTH52YAbAM5UNju9R1ZpKdeX59pj8QK eWnXW66BzWr4iczjl7FiLPAUTondZFOcr6/uhn2yuAQVvkimYUdE24sxy/O3zjMVB1 KwbUK2Fx87db/UGL1kYpM/3Fyt+8xLvW7G+OraGl8CKB4YuX4LQNhVUI64nFFwCMVs RyE/iz5tUdozOJ62Q2M/AM8LAAQJhxvMZRsODojUipi0hTbgmsOghXvmwknqQsjuti Y3Ktp38OfALtBNNBIxkWK+9D14SPSH/wyYOxrNwvCjnqssVN9VenU1x78/MmZCqb0Z cLXwUsavUVZLg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 13/20] x86: decompressor: Factor out kernel decompression and relocation Date: Mon, 8 May 2023 09:03:23 +0200 Message-Id: <20230508070330.582131-14-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2967; i=ardb@kernel.org; h=from:subject; bh=PS6uh5wyY4cP3z3FsRLfK/TPFwIu+D7LfOjVysY9Mk0=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3s4F7BdX2XS7fxG0VGxmDYx2tVrhIJHIus1JJbs2q qbARKujlIVBjINBVkyRRWD233c7T0+UqnWeJQszh5UJZAgDF6cATCTAh5HhuU5p+Ttz/pS41LbY yN5Jpz+GvY07INFnaVLUznVM80oyw/+S52qN15bUCCu8Mt3IvumxpN2mwuLpj/eIdF6e1tC0L4o NAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Factor out the decompressor sequence that invokes the decompressor, parses the ELF and applies the relocations so that we will be able to call it directly from the EFI stub. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/misc.c | 28 ++++++++++++++++---- arch/x86/include/asm/boot.h | 8 ++++++ 2 files changed, 31 insertions(+), 5 deletions(-) diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 1cd40cb9fb4e5027..3635cbeaca1c03cf 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -330,11 +330,33 @@ static size_t parse_elf(void *output) return ehdr.e_entry - LOAD_PHYSICAL_ADDR; } +const unsigned long kernel_total_size = VO__end - VO__text; + static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4); extern unsigned char input_data[]; extern unsigned int input_len, output_len; +unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr, + void (*error)(char *x)) +{ + unsigned long entry; + + if (!free_mem_ptr) { + free_mem_ptr = (unsigned long)boot_heap; + free_mem_end_ptr = (unsigned long)boot_heap + sizeof(boot_heap); + } + + if (__decompress(input_data, input_len, NULL, NULL, outbuf, output_len, + NULL, error) < 0) + return ULONG_MAX; + + entry = parse_elf(outbuf); + handle_relocations(outbuf, output_len, virt_addr); + + return entry; +} + /* * The compressed kernel image (ZO), has been moved so that its position * is against the end of the buffer used to hold the uncompressed kernel @@ -354,7 +376,6 @@ extern unsigned int input_len, output_len; */ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output) { - const unsigned long kernel_total_size = VO__end - VO__text; unsigned long virt_addr = LOAD_PHYSICAL_ADDR; memptr heap = (memptr)boot_heap; unsigned long needed_size; @@ -457,10 +478,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output) #endif debug_putstr("\nDecompressing Linux... "); - __decompress(input_data, input_len, NULL, NULL, output, output_len, - NULL, error); - entry_offset = parse_elf(output); - handle_relocations(output, output_len, virt_addr); + entry_offset = decompress_kernel(output, virt_addr, error); debug_putstr("done.\nBooting the kernel (entry_offset: 0x"); debug_puthex(entry_offset); diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h index 9191280d9ea3160d..4ae14339cb8cc72d 100644 --- a/arch/x86/include/asm/boot.h +++ b/arch/x86/include/asm/boot.h @@ -62,4 +62,12 @@ # define BOOT_STACK_SIZE 0x1000 #endif +#ifndef __ASSEMBLY__ +extern unsigned int output_len; +extern const unsigned long kernel_total_size; + +unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr, + void (*error)(char *x)); +#endif + #endif /* _ASM_X86_BOOT_H */ From patchwork Mon May 8 07:03:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680348 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86C92C77B75 for ; Mon, 8 May 2023 07:05:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232868AbjEHHF6 (ORCPT ); Mon, 8 May 2023 03:05:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232870AbjEHHFP (ORCPT ); Mon, 8 May 2023 03:05:15 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B0FBC1A1E3; Mon, 8 May 2023 00:04:40 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DE58661F98; Mon, 8 May 2023 07:04:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 874FEC433EF; Mon, 8 May 2023 07:04:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529479; bh=RgKaN65vbfxpRIG/ejUZYCBYrcPUPZF6ystsCPDQ7Rc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=U3LlLuRVN5uMD2fd8UqcE3lOgORz52I9N0Ssuenr3NCowef4f5wnza2DbafUlNOUe AjWBZNvockmtJT4Ioz5W/uzfqjrUZdAruuQnJoj1eX2QNaveDHymG39BcjTIhfS/sQ VpJFvUzeVAwXMjBKkQOkv7tlLeKusrZA8YFbFeHBzV7/YMzKah/D0gCuBdNKlB+pCf T2BUnMPE54o/6FzoIa4lOQeDeVbRm2DUObmEYBQPhllhSYAzKn/JR2U2SrhnfuI3rj llMFHcInfOBTpMLyHWY05znOYxJUZOSRzMi0WZ3zhJo3Aeg0U0WaTUxyfhx+tO0FH1 3RFCrdi1qQytA== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 14/20] x86: head_64: Store boot_params pointer in callee-preserved register Date: Mon, 8 May 2023 09:03:24 +0200 Message-Id: <20230508070330.582131-15-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2417; i=ardb@kernel.org; h=from:subject; bh=RgKaN65vbfxpRIG/ejUZYCBYrcPUPZF6ystsCPDQ7Rc=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3m6fqRP3JLw1t71ma9maM/n/Ff/Jn8+XVdzWdMvXt ulgfJvTUcrCIMbBICumyCIw+++7nacnStU6z5KFmcPKBDKEgYtTACai8J3hv+crjTrDVTZdTrI1 PuG+Kz1Fj6yweTf/6PMTRmq+3Vcqoxj+GQWmNnOvsFm+O7DkfhUTzzLD/xbBoc2/Q1dV9naVuvz mAwA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org Instead of pushing/popping %RSI to/from the stack every time we call a function from startup_64(), just store it in a callee preserved register until we're ready to pass it on. Signed-off-by: Ard Biesheuvel --- arch/x86/kernel/head_64.S | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index a5df3e994f04f10f..95b12fdae10e1dc9 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -60,6 +60,7 @@ SYM_CODE_START_NOALIGN(startup_64) * compiled to run at we first fixup the physical addresses in our page * tables and then reload them. */ + mov %rsi, %r15 /* Preserve boot_params pointer */ /* Set up the stack for verify_cpu() */ leaq (__end_init_task - PTREGS_SIZE)(%rip), %rsp @@ -73,9 +74,7 @@ SYM_CODE_START_NOALIGN(startup_64) shrq $32, %rdx wrmsr - pushq %rsi call startup_64_setup_env - popq %rsi #ifdef CONFIG_AMD_MEM_ENCRYPT /* @@ -84,10 +83,8 @@ SYM_CODE_START_NOALIGN(startup_64) * which needs to be done before any CPUID instructions are executed in * subsequent code. */ - movq %rsi, %rdi - pushq %rsi + movq %r15, %rdi call sme_enable - popq %rsi #endif /* Now switch to __KERNEL_CS so IRET works reliably */ @@ -109,9 +106,7 @@ SYM_CODE_START_NOALIGN(startup_64) * programmed into CR3. */ leaq _text(%rip), %rdi - pushq %rsi call __startup_64 - popq %rsi /* Form the CR3 value being sure to include the CR3 modifier */ addq $(early_top_pgt - __START_KERNEL_map), %rax @@ -200,10 +195,8 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) * %rsi carries pointer to realmode data and is callee-clobbered. Save * and restore it. */ - pushq %rsi movq %rax, %rdi call sev_verify_cbit - popq %rsi /* * Switch to new page-table @@ -294,9 +287,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) wrmsr /* Setup and Load IDT */ - pushq %rsi call early_setup_idt - popq %rsi /* Check if nx is implemented */ movl $0x80000001, %eax @@ -334,7 +325,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) /* rsi is pointer to real mode structure with interesting info. pass it to C */ - movq %rsi, %rdi + movq %r15, %rdi .Ljump_to_C_code: /* From patchwork Mon May 8 07:03:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3AF4DC77B75 for ; Mon, 8 May 2023 07:06:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233108AbjEHHGC (ORCPT ); Mon, 8 May 2023 03:06:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233110AbjEHHF2 (ORCPT ); Mon, 8 May 2023 03:05:28 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 151221A624; Mon, 8 May 2023 00:04:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1C3A461F94; Mon, 8 May 2023 07:04:44 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BBA35C4339E; Mon, 8 May 2023 07:04:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529483; bh=pOa0uLaUQL7guzGpAfdDsFjGbf6C6UXPFUPb62ISdKo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MV7+bAoMq5TAPFw31XJPnDEkdUatdL34ctMBvUACIDpuRCOIQxcv+cN/fP7ZodXf0 gOEE+FpmtfxP4aUbUKQeWKvA4ei6eyh1fVC/D7JGOPc6yYKBKoqEZCaWgsl5YTQ9yW N/88DSSx10p7kdCFoonhniqFcxDqBFvVi29/SLa54MQp45eXYT1KqZEF/tK86jsno9 ToHNSl7xjju+mjwvvGVfSe08BXB/s5Rwp2sOFK3wDPbbMikayQZp7EjgP5soEjJMUo 56dTotecWxzGNGQYRBDrSEqJlRSs4HxPqfc5X/98Dr+kYI4wDUiTA31yguu5keD/nv jglyCJqT6Rqrg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 15/20] x86: head_64: Switch to kernel CS before enabling memory encryption Date: Mon, 8 May 2023 09:03:25 +0200 Message-Id: <20230508070330.582131-16-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1479; i=ardb@kernel.org; h=from:subject; bh=pOa0uLaUQL7guzGpAfdDsFjGbf6C6UXPFUPb62ISdKo=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3p5L2//eS8mLWR1Vc7Go4HNebujRny93XNr8vmr6W nOJ9WcTO0pZGMQ4GGTFFFkEZv99t/P0RKla51myMHNYmUCGMHBxCsBEjt9mZPiyxNAv8PrbkOz5 Ey1z2rK7PW887NTq/fNOhIE1+6u5jC0jwzUfW7MwB4fv+6/4zvfYoHOyTognumiRo2m0bNPrFUs CuAE= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The SME initialization triggers #VC exceptions due to the use of CPUID instructions, and returning from an exception restores the code segment that was active when the exception was taken. This means we should ensure that we switch the code segment to one that is described in the GDT we just loaded before running the SME init code. Reported-by: Tom Lendacky Signed-off-by: Ard Biesheuvel --- arch/x86/kernel/head_64.S | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 95b12fdae10e1dc9..a128ac62956ff7c4 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -76,6 +76,15 @@ SYM_CODE_START_NOALIGN(startup_64) call startup_64_setup_env + /* Now switch to __KERNEL_CS so IRET works reliably */ + pushq $__KERNEL_CS + leaq .Lon_kernel_cs(%rip), %rax + pushq %rax + lretq + +.Lon_kernel_cs: + UNWIND_HINT_END_OF_STACK + #ifdef CONFIG_AMD_MEM_ENCRYPT /* * Activate SEV/SME memory encryption if supported/enabled. This needs to @@ -87,15 +96,6 @@ SYM_CODE_START_NOALIGN(startup_64) call sme_enable #endif - /* Now switch to __KERNEL_CS so IRET works reliably */ - pushq $__KERNEL_CS - leaq .Lon_kernel_cs(%rip), %rax - pushq %rax - lretq - -.Lon_kernel_cs: - UNWIND_HINT_END_OF_STACK - /* Sanitize CPU configuration */ call verify_cpu From patchwork Mon May 8 07:03:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 192BFC77B75 for ; Mon, 8 May 2023 07:06:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233178AbjEHHGI (ORCPT ); Mon, 8 May 2023 03:06:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233184AbjEHHFb (ORCPT ); Mon, 8 May 2023 03:05:31 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 167351A1E6; Mon, 8 May 2023 00:04:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4A8DE61F96; Mon, 8 May 2023 07:04:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE786C433D2; Mon, 8 May 2023 07:04:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529487; bh=E2w5aloeuQ0G0hJL5MGQzf9qEE3fMq6Wyl1+QQk0LOk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rEmNJbMdQdN9J83vYxRk8fdLdzgUe/W7/42WwfYmQZ+6okO4dTvwP3Oqtn+FS6kLw 7BaoY5LWJ7u0BSGYiKlA+IRjpwn5UeguQwng0Mr97nRmTqb00gA8Kk4EhKqD0I994u HnAhls0NapOM1aZP1nOThCVKCT6fREG1ZizEz9LViNtbNRt4bPa7sDfTEmybspRnTz NuvumijF9s5/VgSHGQ6vzz95/CRwqOxgWBVSsg30vS37e+AlmJeZSno7pqR5c3MVpy ksbwDGNpjOiEPjzOxvbLio1EesEw9yZM6YZiHdO3rdgUHgcZjEEkvjFM1vaPUDix4/ 4Y+0FdMaHQGXA== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 16/20] efi: libstub: Add limit argument to efi_random_alloc() Date: Mon, 8 May 2023 09:03:26 +0200 Message-Id: <20230508070330.582131-17-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3943; i=ardb@kernel.org; h=from:subject; bh=E2w5aloeuQ0G0hJL5MGQzf9qEE3fMq6Wyl1+QQk0LOk=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3t65TNfa731pL/csfX3/EOPXLP61z46d+qxysu3di kNnHy+M7ShlYRDjYJAVU2QRmP333c7TE6VqnWfJwsxhZQIZwsDFKQATSTRg+Ct6fcqcrfzbBF+/ uct6wfq+ZP6Z87v2ql3gPS+fqtMl/mAKw39vth1OX1oXXth2ittfr0yn/Wn82ZYbmziFlj8sVbu 5rI8PAA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org x86 will need to limit the kernel memory allocation to the lowest 512 MiB of memory, to match the behavior of the existing bare metal KASLR physical randomization logic. So in preparation for that, add a limit parameter to efi_random_alloc() and wire it up. Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/arm64-stub.c | 2 +- drivers/firmware/efi/libstub/efistub.h | 2 +- drivers/firmware/efi/libstub/randomalloc.c | 10 ++++++---- drivers/firmware/efi/libstub/zboot.c | 2 +- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c index 770b8ecb73984c61..8c40fc89f5f99209 100644 --- a/drivers/firmware/efi/libstub/arm64-stub.c +++ b/drivers/firmware/efi/libstub/arm64-stub.c @@ -106,7 +106,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr, */ status = efi_random_alloc(*reserve_size, min_kimg_align, reserve_addr, phys_seed, - EFI_LOADER_CODE); + EFI_LOADER_CODE, EFI_ALLOC_LIMIT); if (status != EFI_SUCCESS) efi_warn("efi_random_alloc() failed: 0x%lx\n", status); } else { diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h index 67d5a20802e0b7c6..03e3cec87ffbe2d1 100644 --- a/drivers/firmware/efi/libstub/efistub.h +++ b/drivers/firmware/efi/libstub/efistub.h @@ -955,7 +955,7 @@ efi_status_t efi_get_random_bytes(unsigned long size, u8 *out); efi_status_t efi_random_alloc(unsigned long size, unsigned long align, unsigned long *addr, unsigned long random_seed, - int memory_type); + int memory_type, unsigned long alloc_limit); efi_status_t efi_random_get_seed(void); diff --git a/drivers/firmware/efi/libstub/randomalloc.c b/drivers/firmware/efi/libstub/randomalloc.c index 32c7a54923b4c127..674a064b8f7adc68 100644 --- a/drivers/firmware/efi/libstub/randomalloc.c +++ b/drivers/firmware/efi/libstub/randomalloc.c @@ -16,7 +16,8 @@ */ static unsigned long get_entry_num_slots(efi_memory_desc_t *md, unsigned long size, - unsigned long align_shift) + unsigned long align_shift, + u64 alloc_limit) { unsigned long align = 1UL << align_shift; u64 first_slot, last_slot, region_end; @@ -29,7 +30,7 @@ static unsigned long get_entry_num_slots(efi_memory_desc_t *md, return 0; region_end = min(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - 1, - (u64)EFI_ALLOC_LIMIT); + alloc_limit); if (region_end < size) return 0; @@ -54,7 +55,8 @@ efi_status_t efi_random_alloc(unsigned long size, unsigned long align, unsigned long *addr, unsigned long random_seed, - int memory_type) + int memory_type, + unsigned long alloc_limit) { unsigned long total_slots = 0, target_slot; unsigned long total_mirrored_slots = 0; @@ -76,7 +78,7 @@ efi_status_t efi_random_alloc(unsigned long size, efi_memory_desc_t *md = (void *)map->map + map_offset; unsigned long slots; - slots = get_entry_num_slots(md, size, ilog2(align)); + slots = get_entry_num_slots(md, size, ilog2(align), alloc_limit); MD_NUM_SLOTS(md) = slots; total_slots += slots; if (md->attribute & EFI_MEMORY_MORE_RELIABLE) diff --git a/drivers/firmware/efi/libstub/zboot.c b/drivers/firmware/efi/libstub/zboot.c index e5d7fa1f1d8fd160..bdb17eac0cb401be 100644 --- a/drivers/firmware/efi/libstub/zboot.c +++ b/drivers/firmware/efi/libstub/zboot.c @@ -119,7 +119,7 @@ efi_zboot_entry(efi_handle_t handle, efi_system_table_t *systab) } status = efi_random_alloc(alloc_size, min_kimg_align, &image_base, - seed, EFI_LOADER_CODE); + seed, EFI_LOADER_CODE, EFI_ALLOC_LIMIT); if (status != EFI_SUCCESS) { efi_err("Failed to allocate memory\n"); goto free_cmdline; From patchwork Mon May 8 07:03:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680030 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22766C7EE24 for ; Mon, 8 May 2023 07:06:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233210AbjEHHGO (ORCPT ); Mon, 8 May 2023 03:06:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233215AbjEHHFc (ORCPT ); Mon, 8 May 2023 03:05:32 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E41E1A48B; Mon, 8 May 2023 00:04:55 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7F1ED61FAB; Mon, 8 May 2023 07:04:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2CC39C4339B; Mon, 8 May 2023 07:04:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529491; bh=1s9q3daoOQOBrHBekUCXV36l5RIRydobOaLGHT6RGX8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dizxd39pyJpk7qPy00tyI7ARKRjwRCg6IbXhfAaFqS0fOwEzHwhyiE+1QoeLCnLWc JbwYT6S6c4DtlbA5C1sv7hulLiRj01KNcNbkMnI4IBM71JZicp1+sx4Y2SOVYVhyM/ h0v/3N5R92HJjDjb5bhpKH7mVE++907eyPB6y/nfWDZjYjz/v8eFehJBgDtK4iu1l4 dRcdVERloU/AOCE6ZVaSqPMNO9EvQvGNjCfTx9bIUAbF/93bLXpxKOsZ5hqyXAnZiA JNkvWN2KtBVLrJ6h3vzE8uBrV+dAlfdod0j7SUY7lKs9ANMfjXOw+7ymkaABdHNuhW OQara+MdLdPpQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 17/20] x86: efistub: Check SEV/SNP support while running in the firmware Date: Mon, 8 May 2023 09:03:27 +0200 Message-Id: <20230508070330.582131-18-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4790; i=ardb@kernel.org; h=from:subject; bh=1s9q3daoOQOBrHBekUCXV36l5RIRydobOaLGHT6RGX8=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3j5DUbGKXbHtG99vV/+zWC9A+SD3nk2i2485Lvu/7 krjnd3MHaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAiHy8z/NNLYbZetyhzwqPG +cWubxYvXOhTfb7+m8JZxguvt+/iqGhhZNgcumHK2ei/33QOaG/ucT5cvmRaaPePdV99Jtzf4LF 0Bws3AA== X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The decompressor executes in an environment with little or no access to a console, and without any ability to return an error back to the caller (the bootloader). So the only recourse we have when the SEV/SNP context is not quite what the kernel expects is to terminate the guest entirely. This is a bit harsh, and also unnecessary when booting via the EFI stub, given that it provides all the support that SEV guests need to probe the underlying platform. So let's do the SEV initialization and SNP feature check before calling ExitBootServices(), and simply return with an error if the SNP feature mask is not as expected. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/sev.c | 12 ++++++++---- arch/x86/include/asm/sev.h | 4 ++++ drivers/firmware/efi/libstub/x86-stub.c | 17 +++++++++++++++++ 3 files changed, 29 insertions(+), 4 deletions(-) diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 014b89c890887b9a..19c40873fdd209b5 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -315,20 +315,24 @@ static void enforce_vmpl0(void) */ #define SNP_FEATURES_PRESENT (0) +u64 snp_get_unsupported_features(void) +{ + if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED)) + return 0; + return sev_status & SNP_FEATURES_IMPL_REQ & ~SNP_FEATURES_PRESENT; +} + void snp_check_features(void) { u64 unsupported; - if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED)) - return; - /* * Terminate the boot if hypervisor has enabled any feature lacking * guest side implementation. Pass on the unsupported features mask through * EXIT_INFO_2 of the GHCB protocol so that those features can be reported * as part of the guest boot failure. */ - unsupported = sev_status & SNP_FEATURES_IMPL_REQ & ~SNP_FEATURES_PRESENT; + unsupported = snp_get_unsupported_features(); if (unsupported) { if (ghcb_version < 2 || (!boot_ghcb && !early_setup_ghcb())) sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h index 13dc2a9d23c1eb25..bf27b91644d0da5a 100644 --- a/arch/x86/include/asm/sev.h +++ b/arch/x86/include/asm/sev.h @@ -157,6 +157,7 @@ static __always_inline void sev_es_nmi_complete(void) __sev_es_nmi_complete(); } extern int __init sev_es_efi_map_ghcbs(pgd_t *pgd); +extern void sev_enable(struct boot_params *bp); static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { @@ -202,12 +203,14 @@ void snp_set_wakeup_secondary_cpu(void); bool snp_init(struct boot_params *bp); void __init __noreturn snp_abort(void); int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio); +u64 snp_get_unsupported_features(void); #else static inline void sev_es_ist_enter(struct pt_regs *regs) { } static inline void sev_es_ist_exit(void) { } static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { return 0; } static inline void sev_es_nmi_complete(void) { } static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; } +static inline void sev_enable(struct boot_params *bp) { } static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; } static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; } static inline void setup_ghcb(void) { } @@ -225,6 +228,7 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in { return -ENOTTY; } +static inline u64 snp_get_unsupported_features(void) { return 0; } #endif #endif diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index ce8434fce0c37982..33d11ba78f1d8c4f 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "efistub.h" @@ -714,6 +715,22 @@ static efi_status_t exit_boot_func(struct efi_boot_memmap *map, &p->efi->efi_memmap, &p->efi->efi_memmap_hi); p->efi->efi_memmap_size = map->map_size; + /* + * Call the SEV init code while still running with the firmware's + * GDT/IDT, so #VC exceptions will be handled by EFI. + */ + if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) { + u64 unsupported; + + sev_enable(p->boot_params); + unsupported = snp_get_unsupported_features(); + if (unsupported) { + efi_err("Unsupported SEV-SNP features detected: 0x%llx\n", + unsupported); + return EFI_UNSUPPORTED; + } + } + return EFI_SUCCESS; } From patchwork Mon May 8 07:03:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680029 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94599C77B75 for ; Mon, 8 May 2023 07:06:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233368AbjEHHGh (ORCPT ); Mon, 8 May 2023 03:06:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232716AbjEHHF4 (ORCPT ); Mon, 8 May 2023 03:05:56 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DC9C61C0D9; Mon, 8 May 2023 00:05:11 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D7BAE61268; Mon, 8 May 2023 07:04:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5FB21C433A1; Mon, 8 May 2023 07:04:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529496; bh=NzXcIJDJup/xDczhmoqjUR9K76vkUkqDWjo0fYkQ7fo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TUe8xIXxcD22UJEmAqp5BOq1ArXPI9mCcl7RSEFspfUweP3+wq5pPnI41AObSwV4l qdXY+Vg/fgHHuLYM2k6TKwcujIQE+fUoXF1CP5duo1tE16zBZno5DS/DLqfFlDIeJS 0Oz4bhjkYkJFuedlIQE1jUOjPjH2tno82y7hyrINHrUVAxug078JBCuDQB7fLTMXOO DsMsmJhAHUsHCtyjUEhFPBM5Rog9O+hj+4qlVzO2mMKO9vxv6O4m4Shv4XkXLs4wLj fyLKywp3P/ZoYarsDqPITzKjtlSMCaVhRgmUu4smb5OfgaVbhZc3fhFtVhWKmOkN8s XWDaTh3PkNDQQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 18/20] x86: efistub: Avoid legacy decompressor when doing EFI boot Date: Mon, 8 May 2023 09:03:28 +0200 Message-Id: <20230508070330.582131-19-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=19188; i=ardb@kernel.org; h=from:subject; bh=NzXcIJDJup/xDczhmoqjUR9K76vkUkqDWjo0fYkQ7fo=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3oFHLxj/OBnkRH8J21r1oVCdac7s+Iy5b4rtDh3LW ijyQP1aRykLgxgHg6yYIovA7L/vdp6eKFXrPEsWZg4rE8gQBi5OAZhIRw7DH46pdQ4y5jERogqu x1Rilx7yfrNhdtbP+hgb7X9Bj6MW/mBkmCQrcDmveeG/h5t/O57deWL7PaXk5vClU2SaAoozA3+ f5AMA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The bare metal decompressor code was never really intended to run in a hosted environment such as the EFI boot services, and does a few things that are problematic in the context of EFI boot now that the logo requirements are getting tighter. In particular, the decompressor moves its own executable image around in memory, and relies on demand paging to populate the identity mappings, and these things are difficult to support in a context where memory is not permitted to be mapped writable and executable at the same time or, at the very least, is mapped non-executable by default, and needs special treatment for this restriction to be lifted. Since EFI already maps all of memory 1:1, we don't need to create new page tables or handle page faults when decompressing the kernel. That means there is also no need to replace the special exception handlers for SEV. Generally, there is little need to do anything that the decompressor does beyond - initialize SEV encryption, if needed, - perform the 4/5 level paging switch, if needed, - decompress the kernel - relocate the kernel So let's do all of this from the EFI stub code, and avoid the bare metal decompressor altogether. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/Makefile | 5 + arch/x86/boot/compressed/efi_mixed.S | 58 +------ arch/x86/boot/compressed/head_32.S | 22 +-- arch/x86/boot/compressed/head_64.S | 34 ---- arch/x86/include/asm/efi.h | 7 +- drivers/firmware/efi/libstub/x86-stub.c | 176 +++++++++----------- 6 files changed, 95 insertions(+), 207 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 6b6cfe607bdbdfaa..f3289f17f820f982 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -74,6 +74,11 @@ LDFLAGS_vmlinux += -z noexecstack ifeq ($(CONFIG_LD_IS_BFD),y) LDFLAGS_vmlinux += $(call ld-option,--no-warn-rwx-segments) endif +ifeq ($(CONFIG_EFI_STUB),y) +# ensure that we'll pull in the static EFI stub library even if we never +# reference it from the startup code +LDFLAGS_vmlinux += -u efi_pe_entry +endif LDFLAGS_vmlinux += -T hostprogs := mkpiggy diff --git a/arch/x86/boot/compressed/efi_mixed.S b/arch/x86/boot/compressed/efi_mixed.S index 4ca70bf93dc0bdcd..dec579eb1caa16d5 100644 --- a/arch/x86/boot/compressed/efi_mixed.S +++ b/arch/x86/boot/compressed/efi_mixed.S @@ -49,9 +49,12 @@ SYM_FUNC_START(startup_64_mixed_mode) lea efi32_boot_args(%rip), %rdx mov 0(%rdx), %edi mov 4(%rdx), %esi +#ifdef CONFIG_EFI_HANDOVER_PROTOCOL mov 8(%rdx), %edx // saved bootparams pointer test %edx, %edx jnz efi64_stub_entry +#endif + /* * efi_pe_entry uses MS calling convention, which requires 32 bytes of * shadow space on the stack even if all arguments are passed in @@ -245,10 +248,6 @@ SYM_FUNC_START(efi32_entry) jmp startup_32 SYM_FUNC_END(efi32_entry) -#define ST32_boottime 60 // offsetof(efi_system_table_32_t, boottime) -#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol) -#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base) - /* * efi_status_t efi32_pe_entry(efi_handle_t image_handle, * efi_system_table_32_t *sys_table) @@ -256,8 +255,6 @@ SYM_FUNC_END(efi32_entry) SYM_FUNC_START(efi32_pe_entry) pushl %ebp movl %esp, %ebp - pushl %eax // dummy push to allocate loaded_image - pushl %ebx // save callee-save registers pushl %edi @@ -266,48 +263,8 @@ SYM_FUNC_START(efi32_pe_entry) movl $0x80000003, %eax // EFI_UNSUPPORTED jnz 2f - call 1f -1: pop %ebx - - /* Get the loaded image protocol pointer from the image handle */ - leal -4(%ebp), %eax - pushl %eax // &loaded_image - leal (loaded_image_proto - 1b)(%ebx), %eax - pushl %eax // pass the GUID address - pushl 8(%ebp) // pass the image handle - - /* - * Note the alignment of the stack frame. - * sys_table - * handle <-- 16-byte aligned on entry by ABI - * return address - * frame pointer - * loaded_image <-- local variable - * saved %ebx <-- 16-byte aligned here - * saved %edi - * &loaded_image - * &loaded_image_proto - * handle <-- 16-byte aligned for call to handle_protocol - */ - - movl 12(%ebp), %eax // sys_table - movl ST32_boottime(%eax), %eax // sys_table->boottime - call *BS32_handle_protocol(%eax) // sys_table->boottime->handle_protocol - addl $12, %esp // restore argument space - testl %eax, %eax - jnz 2f - movl 8(%ebp), %ecx // image_handle movl 12(%ebp), %edx // sys_table - movl -4(%ebp), %esi // loaded_image - movl LI32_image_base(%esi), %esi // loaded_image->image_base - leal (startup_32 - 1b)(%ebx), %ebp // runtime address of startup_32 - /* - * We need to set the image_offset variable here since startup_32() will - * use it before we get to the 64-bit efi_pe_entry() in C code. - */ - subl %esi, %ebp // calculate image_offset - movl %ebp, (image_offset - 1b)(%ebx) // save image_offset xorl %esi, %esi jmp efi32_entry // pass %ecx, %edx, %esi // no other registers remain live @@ -318,15 +275,6 @@ SYM_FUNC_START(efi32_pe_entry) RET SYM_FUNC_END(efi32_pe_entry) - .section ".rodata" - /* EFI loaded image protocol GUID */ - .balign 4 -SYM_DATA_START_LOCAL(loaded_image_proto) - .long 0x5b1b31a1 - .word 0x9562, 0x11d2 - .byte 0x8e, 0x3f, 0x00, 0xa0, 0xc9, 0x69, 0x72, 0x3b -SYM_DATA_END(loaded_image_proto) - .data .balign 8 SYM_DATA_START_LOCAL(efi32_boot_gdt) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 8ee8749007595fcc..3f9b80726070a8e7 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -84,19 +84,6 @@ SYM_FUNC_START(startup_32) #ifdef CONFIG_RELOCATABLE leal startup_32@GOTOFF(%edx), %ebx - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32() will be at an - * offset to the start of the space allocated for the image. efi_pe_entry() will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - subl image_offset@GOTOFF(%edx), %ebx -#endif - movl BP_kernel_alignment(%esi), %eax decl %eax addl %eax, %ebx @@ -150,15 +137,10 @@ SYM_FUNC_START(startup_32) jmp *%eax SYM_FUNC_END(startup_32) -#ifdef CONFIG_EFI_STUB +#ifdef CONFIG_EFI_HANDOVER_PROTOCOL SYM_FUNC_START(efi32_stub_entry) - add $0x4, %esp - movl 8(%esp), %esi /* save boot_params pointer */ - call efi_main - /* efi_main returns the possibly relocated address of startup_32 */ - jmp *%eax + jmp efi_main SYM_FUNC_END(efi32_stub_entry) -SYM_FUNC_ALIAS(efi_stub_entry, efi32_stub_entry) #endif .text diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index bcf678aed81468e3..320e2825ff0b32da 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -146,19 +146,6 @@ SYM_FUNC_START(startup_32) #ifdef CONFIG_RELOCATABLE movl %ebp, %ebx - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32 will be at an - * offset to the start of the space allocated for the image. efi_pe_entry will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - subl rva(image_offset)(%ebp), %ebx -#endif - movl BP_kernel_alignment(%esi), %eax decl %eax addl %eax, %ebx @@ -346,20 +333,6 @@ SYM_CODE_START(startup_64) /* Start with the delta to where the kernel will run at. */ #ifdef CONFIG_RELOCATABLE leaq startup_32(%rip) /* - $startup_32 */, %rbp - -#ifdef CONFIG_EFI_STUB -/* - * If we were loaded via the EFI LoadImage service, startup_32 will be at an - * offset to the start of the space allocated for the image. efi_pe_entry will - * set up image_offset to tell us where the image actually starts, so that we - * can use the full available buffer. - * image_offset = startup_32 - image_base - * Otherwise image_offset will be zero and has no effect on the calculations. - */ - movl image_offset(%rip), %eax - subq %rax, %rbp -#endif - movl BP_kernel_alignment(%rsi), %eax decl %eax addq %rax, %rbp @@ -481,19 +454,12 @@ SYM_CODE_START(startup_64) jmp *%rax SYM_CODE_END(startup_64) -#ifdef CONFIG_EFI_STUB #ifdef CONFIG_EFI_HANDOVER_PROTOCOL .org 0x390 -#endif SYM_FUNC_START(efi64_stub_entry) and $~0xf, %rsp /* realign the stack */ - movq %rdx, %rbx /* save boot_params pointer */ call efi_main - movq %rbx,%rsi - leaq rva(startup_64)(%rax), %rax - jmp *%rax SYM_FUNC_END(efi64_stub_entry) -SYM_FUNC_ALIAS(efi_stub_entry, efi64_stub_entry) #endif .text diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index 419280d263d2e3f2..24fa828eeef7d9dd 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -88,6 +88,8 @@ static inline void efi_fpu_end(void) } #ifdef CONFIG_X86_32 +#define EFI_X86_KERNEL_ALLOC_LIMIT (SZ_512M - 1) + #define arch_efi_call_virt_setup() \ ({ \ efi_fpu_begin(); \ @@ -101,8 +103,7 @@ static inline void efi_fpu_end(void) }) #else /* !CONFIG_X86_32 */ - -#define EFI_LOADER_SIGNATURE "EL64" +#define EFI_X86_KERNEL_ALLOC_LIMIT EFI_ALLOC_LIMIT extern asmlinkage u64 __efi_call(void *fp, ...); @@ -216,6 +217,8 @@ efi_status_t efi_set_virtual_address_map(unsigned long memory_map_size, #ifdef CONFIG_EFI_MIXED +#define EFI_ALLOC_LIMIT (efi_is_64bit() ? ULONG_MAX : U32_MAX) + #define ARCH_HAS_EFISTUB_WRAPPERS static inline bool efi_is_64bit(void) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 33d11ba78f1d8c4f..59076e16c1ac11ee 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -15,16 +15,13 @@ #include #include #include +#include #include #include "efistub.h" -/* Maximum physical address for 64-bit kernel with 4-level paging */ -#define MAXMEM_X86_64_4LEVEL (1ull << 46) - const efi_system_table_t *efi_system_table; const efi_dxe_services_table_t *efi_dxe_table; -u32 image_offset __section(".data"); static efi_loaded_image_t *image = NULL; static efi_memory_attribute_protocol_t *memattr; @@ -275,33 +272,9 @@ adjust_memory_range_protection(unsigned long start, unsigned long size) } } -void startup_32(struct boot_params *boot_params); - -static void -setup_memory_protection(unsigned long image_base, unsigned long image_size) -{ -#ifdef CONFIG_64BIT - if (image_base != (unsigned long)startup_32) - adjust_memory_range_protection(image_base, image_size); -#else - /* - * Clear protection flags on a whole range of possible - * addresses used for KASLR. We don't need to do that - * on x86_64, since KASLR/extraction is performed after - * dedicated identity page tables are built and we only - * need to remove possible protection on relocated image - * itself disregarding further relocations. - */ - adjust_memory_range_protection(LOAD_PHYSICAL_ADDR, - KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR); -#endif -} - static const efi_char16_t apple[] = L"Apple"; -static void setup_quirks(struct boot_params *boot_params, - unsigned long image_base, - unsigned long image_size) +static void setup_quirks(struct boot_params *boot_params) { efi_char16_t *fw_vendor = (efi_char16_t *)(unsigned long) efi_table_attr(efi_system_table, fw_vendor); @@ -310,9 +283,6 @@ static void setup_quirks(struct boot_params *boot_params, if (IS_ENABLED(CONFIG_APPLE_PROPERTIES)) retrieve_apple_device_properties(boot_params); } - - if (IS_ENABLED(CONFIG_EFI_DXE_MEM_ATTRIBUTES)) - setup_memory_protection(image_base, image_size); } /* @@ -432,6 +402,7 @@ static void __noreturn efi_exit(efi_handle_t handle, efi_status_t status) asm("hlt"); } +static __alias(efi_main) void __noreturn efi_stub_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg, struct boot_params *boot_params); @@ -465,7 +436,6 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, } image_base = efi_table_attr(image, image_base); - image_offset = (void *)startup_32 - image_base; status = efi_allocate_pages(sizeof(struct boot_params), (unsigned long *)&boot_params, ULONG_MAX); @@ -853,20 +823,82 @@ static void efi_5level_switch(void) #endif } +static void efi_get_seed(void *seed, int size) +{ + efi_get_random_bytes(size, seed); + + /* + * This only updates seed[0] when running on 32-bit, but in that case, + * we don't use seed[1] anyway, as there is no virtual KASLR on 32-bit. + */ + *(unsigned long *)seed ^= kaslr_get_random_long("EFI"); +} + +static void error(char *str) +{ + efi_warn("Decompression failed: %s\n", str); +} + +static efi_status_t efi_decompress_kernel(unsigned long *kernel_entry) +{ + unsigned long virt_addr = LOAD_PHYSICAL_ADDR; + unsigned long addr, alloc_size, entry; + efi_status_t status; + u32 seed[2] = {}; + + /* determine the required size of the allocation */ + alloc_size = ALIGN(max((unsigned long)output_len, kernel_total_size), + MIN_KERNEL_ALIGN); + + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && !efi_nokaslr) { + u64 range = KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR - kernel_total_size; + + efi_get_seed(seed, sizeof(seed)); + + virt_addr += (range * seed[1]) >> 32; + virt_addr &= ~(CONFIG_PHYSICAL_ALIGN - 1); + } + + status = efi_random_alloc(alloc_size, CONFIG_PHYSICAL_ALIGN, &addr, + seed[0], EFI_LOADER_CODE, + EFI_X86_KERNEL_ALLOC_LIMIT); + if (status != EFI_SUCCESS) + return status; + + entry = decompress_kernel((void *)addr, virt_addr, error); + if (entry == ULONG_MAX) { + efi_free(alloc_size, addr); + return EFI_LOAD_ERROR; + } + + *kernel_entry = addr + entry; + + adjust_memory_range_protection(addr, kernel_total_size); + + return EFI_SUCCESS; +} + +static void __noreturn enter_kernel(unsigned long kernel_addr, + struct boot_params *boot_params) +{ + /* enter decompressed kernel with boot_params pointer in RSI/ESI */ + asm("jmp *%0"::"r"(kernel_addr), "S"(boot_params)); + + unreachable(); +} + /* - * On success, we return the address of startup_32, which has potentially been - * relocated by efi_relocate_kernel. + * On success, we jump to the relocated kernel directly and never return. * On failure, we exit to the firmware via efi_exit instead of returning. */ -asmlinkage unsigned long efi_main(efi_handle_t handle, - efi_system_table_t *sys_table_arg, - struct boot_params *boot_params) +asmlinkage void __noreturn efi_main(efi_handle_t handle, + efi_system_table_t *sys_table_arg, + struct boot_params *boot_params) { efi_guid_t guid = EFI_MEMORY_ATTRIBUTE_PROTOCOL_GUID; - unsigned long bzimage_addr = (unsigned long)startup_32; - unsigned long buffer_start, buffer_end; struct setup_header *hdr = &boot_params->hdr; const struct linux_efi_initrd *initrd = NULL; + unsigned long kernel_entry; efi_status_t status; efi_system_table = sys_table_arg; @@ -892,60 +924,6 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, goto fail; } - /* - * If the kernel isn't already loaded at a suitable address, - * relocate it. - * - * It must be loaded above LOAD_PHYSICAL_ADDR. - * - * The maximum address for 64-bit is 1 << 46 for 4-level paging. This - * is defined as the macro MAXMEM, but unfortunately that is not a - * compile-time constant if 5-level paging is configured, so we instead - * define our own macro for use here. - * - * For 32-bit, the maximum address is complicated to figure out, for - * now use KERNEL_IMAGE_SIZE, which will be 512MiB, the same as what - * KASLR uses. - * - * Also relocate it if image_offset is zero, i.e. the kernel wasn't - * loaded by LoadImage, but rather by a bootloader that called the - * handover entry. The reason we must always relocate in this case is - * to handle the case of systemd-boot booting a unified kernel image, - * which is a PE executable that contains the bzImage and an initrd as - * COFF sections. The initrd section is placed after the bzImage - * without ensuring that there are at least init_size bytes available - * for the bzImage, and thus the compressed kernel's startup code may - * overwrite the initrd unless it is moved out of the way. - */ - - buffer_start = ALIGN(bzimage_addr - image_offset, - hdr->kernel_alignment); - buffer_end = buffer_start + hdr->init_size; - - if ((buffer_start < LOAD_PHYSICAL_ADDR) || - (IS_ENABLED(CONFIG_X86_32) && buffer_end > KERNEL_IMAGE_SIZE) || - (IS_ENABLED(CONFIG_X86_64) && buffer_end > MAXMEM_X86_64_4LEVEL) || - (image_offset == 0)) { - extern char _bss[]; - - status = efi_relocate_kernel(&bzimage_addr, - (unsigned long)_bss - bzimage_addr, - hdr->init_size, - hdr->pref_address, - hdr->kernel_alignment, - LOAD_PHYSICAL_ADDR); - if (status != EFI_SUCCESS) { - efi_err("efi_relocate_kernel() failed!\n"); - goto fail; - } - /* - * Now that we've copied the kernel elsewhere, we no longer - * have a set up block before startup_32(), so reset image_offset - * to zero in case it was set earlier. - */ - image_offset = 0; - } - #ifdef CONFIG_CMDLINE_BOOL status = efi_parse_options(CONFIG_CMDLINE); if (status != EFI_SUCCESS) { @@ -963,6 +941,12 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, } } + status = efi_decompress_kernel(&kernel_entry); + if (status != EFI_SUCCESS) { + efi_err("Failed to decompress kernel\n"); + goto fail; + } + /* * At this point, an initrd may already have been loaded by the * bootloader and passed via bootparams. We permit an initrd loaded @@ -1002,7 +986,7 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, setup_efi_pci(boot_params); - setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start); + setup_quirks(boot_params); status = exit_boot(boot_params, handle); if (status != EFI_SUCCESS) { @@ -1012,7 +996,7 @@ asmlinkage unsigned long efi_main(efi_handle_t handle, efi_5level_switch(); - return bzimage_addr; + enter_kernel(kernel_entry, boot_params); fail: efi_err("efi_main() failed!\n"); From patchwork Mon May 8 07:03:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5687DC77B75 for ; Mon, 8 May 2023 07:06:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233284AbjEHHGs (ORCPT ); Mon, 8 May 2023 03:06:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233316AbjEHHGC (ORCPT ); Mon, 8 May 2023 03:06:02 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 392DF1CFD0; Mon, 8 May 2023 00:05:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1A64261F88; Mon, 8 May 2023 07:05:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BCCD0C4339C; Mon, 8 May 2023 07:04:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529500; bh=cGAUD8iNyl2Jp4BQzqmRm+7QurWozlRhbNmhWh1rRE4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=absfKyJLjfGHwejOEeDaiioXiSZMY0rzsRpNFBFNAQWOx4H1Tv/L5PwA9Pkd36Ger 2xnNBOLrkmdCzT4r9WgZvPg3s/1MDPHkK9puNlgMdkFHlxs/1bpinhKcpBs3EVbH11 vAKYQODpq1s8IWUoMapM4s9WX9pDfmklMV7FV6GERpAqFBuUwux6pYp+WOJS5+VoSh Y1QzfVo+uktl6J3iMaEdh79mH7ymtBga6Ud2WxmJhCJmxREqSIkhGSmBLg1jKwnAof I2gwkhoEF0MZT0+eR3GUTsZYjLmDh60Hz3SKy2V0hWjboO1y5EdTtsFIoEUqwkEJBq JhNqMxM+wJaiQ== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 19/20] x86: efistub: Clear BSS in EFI handover protocol entrypoint Date: Mon, 8 May 2023 09:03:29 +0200 Message-Id: <20230508070330.582131-20-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3719; i=ardb@kernel.org; h=from:subject; bh=cGAUD8iNyl2Jp4BQzqmRm+7QurWozlRhbNmhWh1rRE4=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3kEJLX/JTguLyJS0j8f/LFn797D8j7X6fK2TLOf+e bs69PKbjlIWBjEOBlkxRRaB2X/f7Tw9UarWeZYszBxWJpAhDFycAjCR7iuMDC+K319wWrO2uVju 3+XzecsOd5ezMdfNOqJnH/hy60xBO0uG/3mqC/RPJRV0++mZH3m/2GbqLtMQsZnlmmozOdqfX31 7ngEA X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The so-called EFI handover protocol is value-add from the distros that permits a loader to simply copy a PE kernel image into memory and call an alternative entrypoint that is described by an embedded boot_params structure. Most implementations of this protocol do not bother to check the PE header for minimum alignment, section placement, etc, and therefore also don't clear the image's BSS, or even allocate enough memory for it. Allocating more memory on the fly is rather difficult, but let's at least clear the BSS explicitly when entering in this manner, so that the decompressor's pseudo-malloc() does not get confused by global variables that were not zero-initialized correctly. Note that we shouldn't clear BSS before calling efi_main() if we are not booting via the handover protocol, but this entrypoint is no longer shared with the pure EFI entrypoint so we can ignore that case here. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_32.S | 6 ----- arch/x86/boot/compressed/head_64.S | 2 +- drivers/firmware/efi/libstub/x86-stub.c | 24 +++++++++++++++++--- 3 files changed, 22 insertions(+), 10 deletions(-) diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S index 3f9b80726070a8e7..cd9587fcd5084f22 100644 --- a/arch/x86/boot/compressed/head_32.S +++ b/arch/x86/boot/compressed/head_32.S @@ -137,12 +137,6 @@ SYM_FUNC_START(startup_32) jmp *%eax SYM_FUNC_END(startup_32) -#ifdef CONFIG_EFI_HANDOVER_PROTOCOL -SYM_FUNC_START(efi32_stub_entry) - jmp efi_main -SYM_FUNC_END(efi32_stub_entry) -#endif - .text SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 320e2825ff0b32da..b7599cbbd2ea1136 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -458,7 +458,7 @@ SYM_CODE_END(startup_64) .org 0x390 SYM_FUNC_START(efi64_stub_entry) and $~0xf, %rsp /* realign the stack */ - call efi_main + call efi_handover_entry SYM_FUNC_END(efi64_stub_entry) #endif diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 59076e16c1ac11ee..0528db3e36cf636b 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -891,9 +891,9 @@ static void __noreturn enter_kernel(unsigned long kernel_addr, * On success, we jump to the relocated kernel directly and never return. * On failure, we exit to the firmware via efi_exit instead of returning. */ -asmlinkage void __noreturn efi_main(efi_handle_t handle, - efi_system_table_t *sys_table_arg, - struct boot_params *boot_params) +static void __noreturn efi_main(efi_handle_t handle, + efi_system_table_t *sys_table_arg, + struct boot_params *boot_params) { efi_guid_t guid = EFI_MEMORY_ATTRIBUTE_PROTOCOL_GUID; struct setup_header *hdr = &boot_params->hdr; @@ -1002,3 +1002,21 @@ asmlinkage void __noreturn efi_main(efi_handle_t handle, efi_exit(handle, status); } + +#ifdef CONFIG_EFI_HANDOVER_PROTOCOL +void efi_handover_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg, + struct boot_params *boot_params) +{ + extern char _bss[], _ebss[]; + + /* Ensure that BSS is zeroed when booting via the handover protocol */ + memset(_bss, 0, _ebss - _bss); + efi_main(handle, sys_table_arg, boot_params); +} + +#ifdef CONFIG_X86_32 +extern __alias(efi_handover_entry) +void efi32_stub_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg, + struct boot_params *boot_params); +#endif +#endif From patchwork Mon May 8 07:03:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 680346 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E48EC77B75 for ; Mon, 8 May 2023 07:06:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233349AbjEHHGT (ORCPT ); Mon, 8 May 2023 03:06:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46022 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233251AbjEHHFo (ORCPT ); Mon, 8 May 2023 03:05:44 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 447441A130; Mon, 8 May 2023 00:05:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5200561F9A; Mon, 8 May 2023 07:05:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 00AB2C4339E; Mon, 8 May 2023 07:05:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1683529504; bh=KZW/tFXMGGOj2MJEPl8J+2P+ukf3gwy/b1cibtTCM2Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uf9hBgxspbXNuKRQ9B4q9XpoSMSLjSdoiefyJe+4fLr53ld0LEmyMir5tgEUB+8HO ihtNDLO99Om+BN0l2EIaNTf28T38A89nuurh4Gq3ZO3PSnmnqo/ZkhanANhpiL6Lv9 VnANpc7xZKswhVu5JOXoEFrli2SWP33nVNvKuJEu4U9fHphQwbNSJroipu138ygFdl TLvsrMCWbbwAtB3gMlal2hivCmn0Gg3d+4q0UTVIBYt0bC9L2EAD/JexgStlSV55P4 TCGcfpc95pAelaMk0qrm9VKUEWJvD0uL/5EDqRU4UAg/NkbHdY+2yz/52L2sKz39LD I0wHIywDBkyfg== From: Ard Biesheuvel To: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , Evgeniy Baskov , Borislav Petkov , Andy Lutomirski , Dave Hansen , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , Alexey Khoroshilov , Peter Jones , Gerd Hoffmann , Dave Young , Mario Limonciello , Kees Cook , Tom Lendacky , "Kirill A . Shutemov" , Linus Torvalds Subject: [PATCH v2 20/20] x86: decompressor: Avoid magic offsets for EFI handover entrypoint Date: Mon, 8 May 2023 09:03:30 +0200 Message-Id: <20230508070330.582131-21-ardb@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230508070330.582131-1-ardb@kernel.org> References: <20230508070330.582131-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1804; i=ardb@kernel.org; h=from:subject; bh=KZW/tFXMGGOj2MJEPl8J+2P+ukf3gwy/b1cibtTCM2Y=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JISVi3qGjnOc+6p4w7IvRl16U8PaWTNCHn46utT2aORHdY S+ytyzuKGVhEONgkBVTZBGY/ffdztMTpWqdZ8nCzGFlAhnCwMUpABORXMXw39s9e6/ke4FMA8+V s61Wbzs+zTLtwNnHa0ULH04/23lGnI/hr0BX3XSPOGfOzRmrq3Y6v3kzfRLr0od2gr6O1vvU25t DWQA= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Precedence: bulk List-ID: X-Mailing-List: linux-efi@vger.kernel.org The special EFI handover protocol entrypoint offset wrt to the startup_XX address is described in struct boot_params as handover_offset, so that the special Linux/x86 aware EFI loader can find it there. When mixed mode is enabled, this single field has to describe this offset for both the 32-bit and 64-bit entrypoints, so their respective relative offsets have to be identical. Currently, we use hard-coded fixed offsets to ensure this, but the only requirement is that the entrypoints are 0x200 bytes apart, and this only matters when EFI mixed mode is configured to begin with. So just set the required offset directly. This could potentially result in a build error if the 32-bit startup code is much smaller than the 64-bit code but this is currently far from the case, and easily fixed when that situation does arise. Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/head_64.S | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index b7599cbbd2ea1136..72780644a2272af8 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -282,7 +282,6 @@ SYM_FUNC_START(startup_32) SYM_FUNC_END(startup_32) #if IS_ENABLED(CONFIG_EFI_MIXED) && IS_ENABLED(CONFIG_EFI_HANDOVER_PROTOCOL) - .org 0x190 SYM_FUNC_START(efi32_stub_entry) add $0x4, %esp /* Discard return address */ popl %ecx @@ -455,7 +454,9 @@ SYM_CODE_START(startup_64) SYM_CODE_END(startup_64) #ifdef CONFIG_EFI_HANDOVER_PROTOCOL - .org 0x390 +#ifdef CONFIG_EFI_MIXED + .org efi32_stub_entry + 0x200 +#endif SYM_FUNC_START(efi64_stub_entry) and $~0xf, %rsp /* realign the stack */ call efi_handover_entry