mbox series

[v2,00/11] efi: Clean up runtime wrapper and wire it up for PRM

Message ID 20230818113724.368492-1-ardb@kernel.org
Headers show
Series efi: Clean up runtime wrapper and wire it up for PRM | expand

Message

Ard Biesheuvel Aug. 18, 2023, 11:37 a.m. UTC
ACPI PRM uses the EFI runtime services infrastructure, but currently, it
issues the firmware calls directly, instead of going through the
wrappers and handing off the call to the EFI workqueue.

Given that ACPI PRM is used for vendor code rather than core EFI runtime
services, it would be nice if these calls get sandboxed in the same way
as EFI runtime services calls are. This ensures that any faults
occurring in the firmware are handled gracefully and don't bring down
the kernel.

Another reason for using the work queue is the fact that on some
platforms, the EFI memory regions are remapped into the lower address
space, and this means that sampling the instruction pointer in a perf
interrupt may cause confusion about whether the task is running in user
space or in the firmware.

So let's move the ACPI PRM calls into the EFI runtime wrapper
infrastructure. Before that, let's clean it up a bit.

Changes since v2:
- add patches to move EFI runtime setup/teardown sequences out of line
- add patch to deduplicate setup/teardown function calls in the
  workqueue handler
- some whitespace cleanup and added a missing __init
- add RFC patches to drop EFI runtime asm trampoline from x86
- add Rafael's ack to the patch that touches drivers/acpi/

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>

Ard Biesheuvel (11):
  efi/x86: Move EFI runtime call setup/teardown helpers out of line
  efi/arm64: Move EFI runtime call setup/teardown helpers out of line
  efi/riscv: Move EFI runtime call setup/teardown helpers out of line
  efi/runtime-wrappers: Use type safe encapsulation of call arguments
  efi/runtime-wrapper: Move workqueue manipulation out of line
  efi/runtime-wrappers: Remove duplicated macro for service returning
    void
  efi/runtime-wrappers: Don't duplicate setup/teardown code
  acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
  efi/runtime-wrappers: Clean up white space and add __init annotation
  efi/x86: Realign EFI runtime stack
  efi/x86: Rely on compiler to emit MS ABI calls

 arch/arm64/include/asm/efi.h            |  18 +-
 arch/arm64/kernel/efi.c                 |  16 +-
 arch/riscv/include/asm/efi.h            |  10 +-
 arch/x86/Makefile                       |   3 +
 arch/x86/include/asm/efi.h              |  43 +--
 arch/x86/include/asm/uv/bios.h          |   3 +-
 arch/x86/platform/efi/Makefile          |   6 +-
 arch/x86/platform/efi/efi_32.c          |  12 +
 arch/x86/platform/efi/efi_64.c          |  21 +-
 arch/x86/platform/efi/efi_stub_64.S     |  27 --
 arch/x86/platform/uv/Makefile           |   1 +
 arch/x86/platform/uv/bios_uv.c          |   4 +-
 drivers/acpi/Kconfig                    |   2 +-
 drivers/acpi/prmt.c                     |   8 +-
 drivers/firmware/efi/Makefile           |   1 +
 drivers/firmware/efi/riscv-runtime.c    |  15 +-
 drivers/firmware/efi/runtime-wrappers.c | 376 +++++++++++++-------
 include/linux/efi.h                     |  52 ++-
 18 files changed, 353 insertions(+), 265 deletions(-)
 delete mode 100644 arch/x86/platform/efi/efi_stub_64.S

Comments

Nathan Chancellor Aug. 22, 2023, 2:01 a.m. UTC | #1
Hi Ard,

On Fri, Aug 18, 2023 at 01:37:13PM +0200, Ard Biesheuvel wrote:

<snip>

> Ard Biesheuvel (11):
>   efi/x86: Move EFI runtime call setup/teardown helpers out of line
>   efi/arm64: Move EFI runtime call setup/teardown helpers out of line
>   efi/riscv: Move EFI runtime call setup/teardown helpers out of line
>   efi/runtime-wrappers: Use type safe encapsulation of call arguments
>   efi/runtime-wrapper: Move workqueue manipulation out of line
>   efi/runtime-wrappers: Remove duplicated macro for service returning
>     void
>   efi/runtime-wrappers: Don't duplicate setup/teardown code
>   acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
>   efi/runtime-wrappers: Clean up white space and add __init annotation
>   efi/x86: Realign EFI runtime stack
>   efi/x86: Rely on compiler to emit MS ABI calls

I took this series for a spin on my arm64 and x86_64 systems that boot
under EFI. I noticed two issues for x86_64, none for arm64. I was hoping
to have a little more information by this point but I had some personal
stuff to deal with today but I figured I would report this initially
and if you want to continue on the ClangBuiltLinux issue tracker, we
certainly can.

1. I see some kCFI failures with this series on x86_64. The fact that
the target is not a symbol makes me think that a function defined in a
.S file is being called indirectly? The following stacktrace is repeated
over and over on all my machines.

  [ 3520.654794] CFI failure at efi_call_rts+0x314/0x3f0 (target: 0xfffffffeee9db7a4; expected type: 0x747b9986)
  [ 3520.654797] WARNING: CPU: 5 PID: 1870 at efi_call_rts+0x314/0x3f0
  [ 3520.654798] Modules linked in: ...
  [ 3520.654828] CPU: 5 PID: 1870 Comm: kworker/u32:0 Tainted: P                   6.5.0-rc6-next-20230818-debug-11225-g78f833901b02 #1 ab5bcd9a29b4227ebb5977a738a805cb7c6c7fff
  [ 3520.654829] Hardware name: ASUS System Product Name/PRIME Z590M-PLUS, BIOS 1203 10/27/2021
  [ 3520.654829] Workqueue: efi_rts_wq efi_call_rts
  [ 3520.654830] RIP: 0010:efi_call_rts+0x314/0x3f0
  [ 3520.654831] Code: 4e 01 4c 8b 58 48 49 8b 0e 49 8b 56 08 4d 8b 46 10 4d 8b 4e 18 49 8b 46 20 48 89 44 24 20 41 ba 7a 66 84 8b 45 03 53 f1 74 02 <0f> 0b 41 ff d3 0f 1f 00 eb 78 0f b6 3d 13 c3 d5 00 e8 36 6f 54 ff
  [ 3520.654832] RSP: 0018:ffffc90004e37e00 EFLAGS: 00010296
  [ 3520.654832] RAX: ffff88814a618004 RBX: 0000000000000206 RCX: ffff8881017da000
  [ 3520.654833] RDX: ffff8881017da400 RSI: ffffffff94d865c0 RDI: 0000000000000000
  [ 3520.654833] RBP: ffffc90004e37e48 R08: ffffc90004eafe04 R09: ffffc90004eafe08
  [ 3520.654834] R10: 000000008b846759 R11: fffffffeee9db7a4 R12: ffffffff95679990
  [ 3520.654834] R13: ffff888100059000 R14: ffffc90004eafd70 R15: ffff8881024c7800
  [ 3520.654835] FS:  0000000000000000(0000) GS:ffff88883f540000(0000) knlGS:0000000000000000
  [ 3520.654835] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 3520.654836] CR2: 00007f3f5814e008 CR3: 000000010020c001 CR4: 0000000000770ee0
  [ 3520.654836] PKRU: 55555554
  [ 3520.654837] Call Trace:
  [ 3520.654837]  <TASK>
  [ 3520.654837]  ? __warn+0xc8/0x1c0
  [ 3520.654838]  ? efi_call_rts+0x314/0x3f0
  [ 3520.654839]  ? report_cfi_failure+0x4e/0x60
  [ 3520.654841]  ? handle_cfi_failure+0x14c/0x1e0
  [ 3520.654842]  ? handle_bug+0x4f/0x90
  [ 3520.654843]  ? exc_invalid_op+0x1a/0x60
  [ 3520.654844]  ? asm_exc_invalid_op+0x1a/0x20
  [ 3520.654845]  ? efi_call_rts+0x314/0x3f0
  [ 3520.654847]  process_scheduled_works+0x25f/0x470
  [ 3520.654848]  worker_thread+0x21c/0x2e0
  [ 3520.654849]  ? __cfi_worker_thread+0x10/0x10
  [ 3520.654851]  kthread+0xf6/0x110
  [ 3520.654852]  ? __cfi_kthread+0x10/0x10
  [ 3520.654853]  ret_from_fork+0x45/0x60
  [ 3520.654854]  ? __cfi_kthread+0x10/0x10
  [ 3520.654856]  ret_from_fork_asm+0x1b/0x30
  [ 3520.654858]  </TASK>
  [ 3520.654858] ---[ end trace 0000000000000000 ]---

2. I see a link failure when building with LTO, presumably due to the
final two patches of the series. I see it with my distribution's
configuration [1] and allmodconfig with CONFIG_LTO_CLANG_THIN=y, I did
not try full LTO. Unfortunately, the message does not seem to make it
obvious what is going on here, I will try to debug this more unless you
beat me to it.

  ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi.o at 1059858)' and 'vmlinux.a(efi_64.o at 1059918)'
  ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi_64.o at 1059918)' and 'vmlinux.a(efi.o at 1059858)'
  ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(runtime-wrappers.o at 1181838)' and 'vmlinux.a(quirks.o at 1059798)'
  ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi.o at 1181178)' and 'vmlinux.a(runtime-wrappers.o at 1181838)'
  ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi_64.o at 1059918)' and 'vmlinux.a(setup.o at 1048218)'

[1]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/raw/main/config

Cheers,
Nathan
Ard Biesheuvel Aug. 22, 2023, 7:53 a.m. UTC | #2
On Tue, 22 Aug 2023 at 04:01, Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Ard,
>
> On Fri, Aug 18, 2023 at 01:37:13PM +0200, Ard Biesheuvel wrote:
>
> <snip>
>
> > Ard Biesheuvel (11):
> >   efi/x86: Move EFI runtime call setup/teardown helpers out of line
> >   efi/arm64: Move EFI runtime call setup/teardown helpers out of line
> >   efi/riscv: Move EFI runtime call setup/teardown helpers out of line
> >   efi/runtime-wrappers: Use type safe encapsulation of call arguments
> >   efi/runtime-wrapper: Move workqueue manipulation out of line
> >   efi/runtime-wrappers: Remove duplicated macro for service returning
> >     void
> >   efi/runtime-wrappers: Don't duplicate setup/teardown code
> >   acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
> >   efi/runtime-wrappers: Clean up white space and add __init annotation
> >   efi/x86: Realign EFI runtime stack
> >   efi/x86: Rely on compiler to emit MS ABI calls
>
> I took this series for a spin on my arm64 and x86_64 systems that boot
> under EFI. I noticed two issues for x86_64, none for arm64. I was hoping
> to have a little more information by this point but I had some personal
> stuff to deal with today but I figured I would report this initially
> and if you want to continue on the ClangBuiltLinux issue tracker, we
> certainly can.
>
> 1. I see some kCFI failures with this series on x86_64. The fact that
> the target is not a symbol makes me think that a function defined in a
> .S file is being called indirectly? The following stacktrace is repeated
> over and over on all my machines.
>

This has to do with the indirect calls being made by the EFI code to
the firmware services, which are not part of the kernel build.

Before, those indirect calls were hidden from the compiler, as they
were made from assembler, but now they are generated by the compiler,
and so we have to inform it that those functions do not have kCFI
metadata.

The below should have fixed it, but I am getting lots of build errors
along the lines of

error: '__no_sanitize__' attribute only applies to functions,
Objective-C methods, and global variables

when I add this. Suggestions welcome on how to inform the compiler
that calls via those function pointers should have __nocfi semantics.

--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -51,7 +51,7 @@ typedef u64 efi_physical_addr_t;
 typedef void *efi_handle_t;

 #if defined(CONFIG_X86_64)
-#define __efiapi __attribute__((ms_abi))
+#define __efiapi __attribute__((ms_abi)) __nocfi
 #elif defined(CONFIG_X86_32)
 #define __efiapi __attribute__((regparm(0)))
 #else

>   [ 3520.654794] CFI failure at efi_call_rts+0x314/0x3f0 (target: 0xfffffffeee9db7a4; expected type: 0x747b9986)
>   [ 3520.654797] WARNING: CPU: 5 PID: 1870 at efi_call_rts+0x314/0x3f0
>   [ 3520.654798] Modules linked in: ...
>   [ 3520.654828] CPU: 5 PID: 1870 Comm: kworker/u32:0 Tainted: P                   6.5.0-rc6-next-20230818-debug-11225-g78f833901b02 #1 ab5bcd9a29b4227ebb5977a738a805cb7c6c7fff
>   [ 3520.654829] Hardware name: ASUS System Product Name/PRIME Z590M-PLUS, BIOS 1203 10/27/2021
>   [ 3520.654829] Workqueue: efi_rts_wq efi_call_rts
>   [ 3520.654830] RIP: 0010:efi_call_rts+0x314/0x3f0
>   [ 3520.654831] Code: 4e 01 4c 8b 58 48 49 8b 0e 49 8b 56 08 4d 8b 46 10 4d 8b 4e 18 49 8b 46 20 48 89 44 24 20 41 ba 7a 66 84 8b 45 03 53 f1 74 02 <0f> 0b 41 ff d3 0f 1f 00 eb 78 0f b6 3d 13 c3 d5 00 e8 36 6f 54 ff
>   [ 3520.654832] RSP: 0018:ffffc90004e37e00 EFLAGS: 00010296
>   [ 3520.654832] RAX: ffff88814a618004 RBX: 0000000000000206 RCX: ffff8881017da000
>   [ 3520.654833] RDX: ffff8881017da400 RSI: ffffffff94d865c0 RDI: 0000000000000000
>   [ 3520.654833] RBP: ffffc90004e37e48 R08: ffffc90004eafe04 R09: ffffc90004eafe08
>   [ 3520.654834] R10: 000000008b846759 R11: fffffffeee9db7a4 R12: ffffffff95679990
>   [ 3520.654834] R13: ffff888100059000 R14: ffffc90004eafd70 R15: ffff8881024c7800
>   [ 3520.654835] FS:  0000000000000000(0000) GS:ffff88883f540000(0000) knlGS:0000000000000000
>   [ 3520.654835] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 3520.654836] CR2: 00007f3f5814e008 CR3: 000000010020c001 CR4: 0000000000770ee0
>   [ 3520.654836] PKRU: 55555554
>   [ 3520.654837] Call Trace:
>   [ 3520.654837]  <TASK>
>   [ 3520.654837]  ? __warn+0xc8/0x1c0
>   [ 3520.654838]  ? efi_call_rts+0x314/0x3f0
>   [ 3520.654839]  ? report_cfi_failure+0x4e/0x60
>   [ 3520.654841]  ? handle_cfi_failure+0x14c/0x1e0
>   [ 3520.654842]  ? handle_bug+0x4f/0x90
>   [ 3520.654843]  ? exc_invalid_op+0x1a/0x60
>   [ 3520.654844]  ? asm_exc_invalid_op+0x1a/0x20
>   [ 3520.654845]  ? efi_call_rts+0x314/0x3f0
>   [ 3520.654847]  process_scheduled_works+0x25f/0x470
>   [ 3520.654848]  worker_thread+0x21c/0x2e0
>   [ 3520.654849]  ? __cfi_worker_thread+0x10/0x10
>   [ 3520.654851]  kthread+0xf6/0x110
>   [ 3520.654852]  ? __cfi_kthread+0x10/0x10
>   [ 3520.654853]  ret_from_fork+0x45/0x60
>   [ 3520.654854]  ? __cfi_kthread+0x10/0x10
>   [ 3520.654856]  ret_from_fork_asm+0x1b/0x30
>   [ 3520.654858]  </TASK>
>   [ 3520.654858] ---[ end trace 0000000000000000 ]---
>
> 2. I see a link failure when building with LTO, presumably due to the
> final two patches of the series. I see it with my distribution's
> configuration [1] and allmodconfig with CONFIG_LTO_CLANG_THIN=y, I did
> not try full LTO. Unfortunately, the message does not seem to make it
> obvious what is going on here, I will try to debug this more unless you
> beat me to it.
>
>   ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi.o at 1059858)' and 'vmlinux.a(efi_64.o at 1059918)'
>   ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi_64.o at 1059918)' and 'vmlinux.a(efi.o at 1059858)'
>   ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(runtime-wrappers.o at 1181838)' and 'vmlinux.a(quirks.o at 1059798)'
>   ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi.o at 1181178)' and 'vmlinux.a(runtime-wrappers.o at 1181838)'
>   ld.lld: error: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in 'vmlinux.a(efi_64.o at 1059918)' and 'vmlinux.a(setup.o at 1048218)'
>
> [1]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/raw/main/config
>

OK, this is probably a nasty one. The x86 kernel uses 8 byte stack
alignment only, instead of the 16 bytes mandated by the calling
convention. The reason is that function calls made from inline asm
will misalign the stack without the compiler knowing about this, and
so the only way to deal with this is to simply use 8 byte alignment
throughout.

EFI runtime services require 16 byte stack alignment, and so the call
needs to originate from a compilation unit that a) uses 16 byte stack
alignment, and b) realigns the stack if its routines are invoked with
a stack that is misaligned. This ensures that the outgoing stack
alignment when actually entering the firmware is correct.

Apparently, LTO does not like it if you use different stack alignment
setting for different object files, which is understandable. I don't
see any way around this, so I will probably have to drop the last two
patches.
Ard Biesheuvel Aug. 22, 2023, 10 a.m. UTC | #3
On Tue, 22 Aug 2023 at 09:53, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 22 Aug 2023 at 04:01, Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > Hi Ard,
> >
> > On Fri, Aug 18, 2023 at 01:37:13PM +0200, Ard Biesheuvel wrote:
> >
> > <snip>
> >
> > > Ard Biesheuvel (11):
> > >   efi/x86: Move EFI runtime call setup/teardown helpers out of line
> > >   efi/arm64: Move EFI runtime call setup/teardown helpers out of line
> > >   efi/riscv: Move EFI runtime call setup/teardown helpers out of line
> > >   efi/runtime-wrappers: Use type safe encapsulation of call arguments
> > >   efi/runtime-wrapper: Move workqueue manipulation out of line
> > >   efi/runtime-wrappers: Remove duplicated macro for service returning
> > >     void
> > >   efi/runtime-wrappers: Don't duplicate setup/teardown code
> > >   acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
> > >   efi/runtime-wrappers: Clean up white space and add __init annotation
> > >   efi/x86: Realign EFI runtime stack
> > >   efi/x86: Rely on compiler to emit MS ABI calls
> >
> > I took this series for a spin on my arm64 and x86_64 systems that boot
> > under EFI. I noticed two issues for x86_64, none for arm64. I was hoping
> > to have a little more information by this point but I had some personal
> > stuff to deal with today but I figured I would report this initially
> > and if you want to continue on the ClangBuiltLinux issue tracker, we
> > certainly can.
> >
> > 1. I see some kCFI failures with this series on x86_64. The fact that
> > the target is not a symbol makes me think that a function defined in a
> > .S file is being called indirectly? The following stacktrace is repeated
> > over and over on all my machines.
> >
>
> This has to do with the indirect calls being made by the EFI code to
> the firmware services, which are not part of the kernel build.
>
> Before, those indirect calls were hidden from the compiler, as they
> were made from assembler, but now they are generated by the compiler,
> and so we have to inform it that those functions do not have kCFI
> metadata.
>
> The below should have fixed it, but I am getting lots of build errors
> along the lines of
>
> error: '__no_sanitize__' attribute only applies to functions,
> Objective-C methods, and global variables
>
> when I add this. Suggestions welcome on how to inform the compiler
> that calls via those function pointers should have __nocfi semantics.
>

OK, so it is a property of the caller, not the callee. So it is simply
a matter of marking every user of the EFI runtime services with
__nocfi. So basically, every routine marked __efi_realign_stack should
be marked __nocfi as well.