diff mbox series

[11/12] efi/loongarch: libstub: remove dependency on flattened DT

Message ID 20220918213544.2176249-12-ardb@kernel.org
State Superseded
Headers show
Series efi: disentangle the generic EFI stub from FDT | expand

Commit Message

Ard Biesheuvel Sept. 18, 2022, 9:35 p.m. UTC
LoongArch does not use FDT or DT natively [yet], and the only reason it
currently uses it is so that it can reuse the existing EFI stub code.

Overloading the DT with data passed between the EFI stub and the core
kernel has been a source of problems: there is the overlap between
information provided by EFI which DT can also provide (initrd base/size,
command line, memory descriptions), requiring us to reason about which
is which and what to prioritize. It has also resulted in ABI leaks,
i.e., internal ABI being promoted to external ABI inadvertently because
the bootloader can set the EFI stub's DT properties as well (e.g.,
"kaslr-seed"). This has become especially problematic with boot
environments that want to pretend that EFI boot is being done (to access
ACPI and SMBIOS tables, for instance) but have no ability to execute the
EFI stub, and so the environment that the EFI stub creates is emulated
[poorly, in some cases].

Another downside of treating DT like this is that the DT binary that the
kernel receives is different from the one created by the firmware, which
is undesirable in the context of secure and measured boot.

Given that LoongArch support in Linux is brand new, we can avoid these
pitfalls, and treat the DT strictly as a hardware description, and use a
separate handover method between the EFI stub and the kernel. Now that
initrd loading and passing the EFI memory map have been refactored into
pure EFI routines that use EFI configuration tables, the only thing we
need to pass directly is the kernel command line (even if we could pass
this via a config table as well, it is used extremely early, so passing
it directly is preferred in this case.)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/loongarch/Kconfig                        |  3 --
 arch/loongarch/include/asm/bootinfo.h         |  2 +-
 arch/loongarch/kernel/efi.c                   | 30 ++++++++++-
 arch/loongarch/kernel/env.c                   | 22 ++++----
 arch/loongarch/kernel/head.S                  |  2 +
 arch/loongarch/kernel/setup.c                 |  4 +-
 drivers/firmware/efi/libstub/Makefile         | 13 +++--
 drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++---
 8 files changed, 100 insertions(+), 32 deletions(-)

Comments

Huacai Chen Sept. 19, 2022, 1:58 a.m. UTC | #1
Hi, Ard,

I think the parameters passed to the core kernel need to be discussed.
The old way (so-called old world):
a0=argc, a1=argv, a1=bpi

The current way (so-called new world):
a0=efi boot flag, a1=fdt pointer

The new way (in this patchset):
a0=efi boot flag, a1=systemtable, a2=cmdline

I prefer to use the current way, there are 3 reasons:
1, both acpi system and dt system can use the same parameters passing method;
2, arm64, riscv and loongarch can use similar logics (light FDT);
3, out-of-tree patches can make compatibility with the old world
easier by just judging whether a2 is zero.

Huacai

On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> LoongArch does not use FDT or DT natively [yet], and the only reason it
> currently uses it is so that it can reuse the existing EFI stub code.
>
> Overloading the DT with data passed between the EFI stub and the core
> kernel has been a source of problems: there is the overlap between
> information provided by EFI which DT can also provide (initrd base/size,
> command line, memory descriptions), requiring us to reason about which
> is which and what to prioritize. It has also resulted in ABI leaks,
> i.e., internal ABI being promoted to external ABI inadvertently because
> the bootloader can set the EFI stub's DT properties as well (e.g.,
> "kaslr-seed"). This has become especially problematic with boot
> environments that want to pretend that EFI boot is being done (to access
> ACPI and SMBIOS tables, for instance) but have no ability to execute the
> EFI stub, and so the environment that the EFI stub creates is emulated
> [poorly, in some cases].
>
> Another downside of treating DT like this is that the DT binary that the
> kernel receives is different from the one created by the firmware, which
> is undesirable in the context of secure and measured boot.
>
> Given that LoongArch support in Linux is brand new, we can avoid these
> pitfalls, and treat the DT strictly as a hardware description, and use a
> separate handover method between the EFI stub and the kernel. Now that
> initrd loading and passing the EFI memory map have been refactored into
> pure EFI routines that use EFI configuration tables, the only thing we
> need to pass directly is the kernel command line (even if we could pass
> this via a config table as well, it is used extremely early, so passing
> it directly is preferred in this case.)
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/loongarch/Kconfig                        |  3 --
>  arch/loongarch/include/asm/bootinfo.h         |  2 +-
>  arch/loongarch/kernel/efi.c                   | 30 ++++++++++-
>  arch/loongarch/kernel/env.c                   | 22 ++++----
>  arch/loongarch/kernel/head.S                  |  2 +
>  arch/loongarch/kernel/setup.c                 |  4 +-
>  drivers/firmware/efi/libstub/Makefile         | 13 +++--
>  drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++---
>  8 files changed, 100 insertions(+), 32 deletions(-)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index fca106a8b8af..14a2a1ec8561 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -104,8 +104,6 @@ config LOONGARCH
>         select MODULES_USE_ELF_RELA if MODULES
>         select NEED_PER_CPU_EMBED_FIRST_CHUNK
>         select NEED_PER_CPU_PAGE_FIRST_CHUNK
> -       select OF
> -       select OF_EARLY_FLATTREE
>         select PCI
>         select PCI_DOMAINS_GENERIC
>         select PCI_ECAM if ACPI
> @@ -311,7 +309,6 @@ config DMI
>  config EFI
>         bool "EFI runtime service support"
>         select UCS2_STRING
> -       select EFI_PARAMS_FROM_FDT
>         select EFI_RUNTIME_WRAPPERS
>         help
>           This enables the kernel to use EFI runtime services that are
> diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
> index e02ac4af7f6e..8e5881bc5ad1 100644
> --- a/arch/loongarch/include/asm/bootinfo.h
> +++ b/arch/loongarch/include/asm/bootinfo.h
> @@ -36,7 +36,7 @@ struct loongson_system_configuration {
>  };
>
>  extern u64 efi_system_table;
> -extern unsigned long fw_arg0, fw_arg1;
> +extern unsigned long fw_arg0, fw_arg1, fw_arg2;
>  extern struct loongson_board_info b_info;
>  extern struct loongson_system_configuration loongson_sysconf;
>
> diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
> index 1f1f755fb425..3b80675726ec 100644
> --- a/arch/loongarch/kernel/efi.c
> +++ b/arch/loongarch/kernel/efi.c
> @@ -27,8 +27,13 @@
>  static unsigned long efi_nr_tables;
>  static unsigned long efi_config_table;
>
> +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR;
> +
>  static efi_system_table_t *efi_systab;
> -static efi_config_table_type_t arch_tables[] __initdata = {{},};
> +static efi_config_table_type_t arch_tables[] __initdata = {
> +       {LINUX_EFI_BOOT_MEMMAP_GUID,    &boot_memmap,   "MEMMAP" },
> +       {},
> +};
>
>  void __init efi_runtime_init(void)
>  {
> @@ -61,6 +66,8 @@ void __init efi_init(void)
>                 return;
>         }
>
> +       efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor);
> +
>         set_bit(EFI_64BIT, &efi.flags);
>         efi_nr_tables    = efi_systab->nr_tables;
>         efi_config_table = (unsigned long)efi_systab->tables;
> @@ -72,4 +79,25 @@ void __init efi_init(void)
>
>         if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
>                 memblock_reserve(screen_info.lfb_base, screen_info.lfb_size);
> +
> +       if (boot_memmap != EFI_INVALID_TABLE_ADDR) {
> +               struct efi_memory_map_data data;
> +               struct efi_boot_memmap *tbl;
> +
> +               tbl = early_memremap_ro(boot_memmap, sizeof(*tbl));
> +               if (tbl) {
> +                       data.phys_map           = boot_memmap + sizeof(*tbl);
> +                       data.size               = tbl->map_size;
> +                       data.desc_size          = tbl->desc_size;
> +                       data.desc_version       = tbl->desc_ver;
> +
> +                       if (efi_memmap_init_early(&data) < 0)
> +                               panic("Unable to map EFI memory map.\n");
> +
> +                       memblock_reserve(data.phys_map & PAGE_MASK,
> +                                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> +
> +                       early_memunmap(tbl, sizeof(*tbl));
> +               }
> +       }
>  }
> diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
> index 82b478a5c665..05c38d28476e 100644
> --- a/arch/loongarch/kernel/env.c
> +++ b/arch/loongarch/kernel/env.c
> @@ -8,7 +8,6 @@
>  #include <linux/efi.h>
>  #include <linux/export.h>
>  #include <linux/memblock.h>
> -#include <linux/of_fdt.h>
>  #include <asm/early_ioremap.h>
>  #include <asm/bootinfo.h>
>  #include <asm/loongson.h>
> @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf);
>  void __init init_environ(void)
>  {
>         int efi_boot = fw_arg0;
> -       struct efi_memory_map_data data;
> -       void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
>
> -       if (efi_boot)
> -               set_bit(EFI_BOOT, &efi.flags);
> -       else
> -               clear_bit(EFI_BOOT, &efi.flags);
> +       if (efi_boot) {
> +               char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE);
>
> -       early_init_dt_scan(fdt_ptr);
> -       early_init_fdt_reserve_self();
> -       efi_system_table = efi_get_fdt_params(&data);
> +               strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE);
> +               early_memunmap(cmdline, COMMAND_LINE_SIZE);
>
> -       efi_memmap_init_early(&data);
> -       memblock_reserve(data.phys_map & PAGE_MASK,
> -                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> +               efi_system_table = fw_arg1;
> +               set_bit(EFI_BOOT, &efi.flags);
> +       } else {
> +               clear_bit(EFI_BOOT, &efi.flags);
> +       }
>  }
>
>  static int __init init_cpu_fullname(void)
> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> index 01bac62a6442..8f89f39fd31b 100644
> --- a/arch/loongarch/kernel/head.S
> +++ b/arch/loongarch/kernel/head.S
> @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry)                  # kernel entry point
>         st.d            a0, t0, 0               # firmware arguments
>         la              t0, fw_arg1
>         st.d            a1, t0, 0
> +       la              t0, fw_arg2
> +       st.d            a2, t0, 0
>
>         /* KSave3 used for percpu base, initialized as 0 */
>         csrwr           zero, PERCPU_BASE_KS
> diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> index e8714b1d94c8..7fabf2306e80 100644
> --- a/arch/loongarch/kernel/setup.c
> +++ b/arch/loongarch/kernel/setup.c
> @@ -51,7 +51,7 @@
>
>  struct screen_info screen_info __section(".data");
>
> -unsigned long fw_arg0, fw_arg1;
> +unsigned long fw_arg0, fw_arg1, fw_arg2;
>  DEFINE_PER_CPU(unsigned long, kernelsp);
>  struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly;
>
> @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem);
>
>  void __init platform_init(void)
>  {
> -       efi_init();
>  #ifdef CONFIG_ACPI_TABLE_UPGRADE
>         acpi_table_upgrade();
>  #endif
> @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p)
>         *cmdline_p = boot_command_line;
>
>         init_environ();
> +       efi_init();
>         memblock_init();
>         parse_early_param();
>
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 0dbc6d93f2e6..d8d6657e6277 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV)                := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>  cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>                                    -fpie
>
> -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)   += -I$(srctree)/scripts/dtc/libfdt
>
>  KBUILD_CFLAGS                  := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
>                                    -include $(srctree)/include/linux/hidden.h \
> @@ -60,14 +60,17 @@ lib-y                               := efi-stub-helper.o gop.o secureboot.o tpm.o \
>                                    alignedmem.o relocate.o vsprintf.o \
>                                    systable.o
>
> -# include the stub's generic dependencies from lib/ when building for ARM/arm64
> -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> +# include the stub's libfdt dependencies from lib/ when needed
> +libfdt-deps                    := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \
> +                                  fdt_empty_tree.c fdt_sw.c
> +
> +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \
> +                                    $(patsubst %.c,lib-%.o,$(libfdt-deps))
>
>  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
>         $(call if_changed_rule,cc_o_c)
>
> -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \
> -                                  $(patsubst %.c,lib-%.o,$(efi-deps-y))
> +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o
>
>  lib-$(CONFIG_ARM)              += arm32-stub.o
>  lib-$(CONFIG_ARM64)            += arm64-stub.o
> diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> index b7ef8d2df59e..7c684d10f936 100644
> --- a/drivers/firmware/efi/libstub/loongarch-stub.c
> +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> @@ -9,7 +9,8 @@
>  #include <asm/addrspace.h>
>  #include "efistub.h"
>
> -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt);
> +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab,
> +                                         unsigned long cmdline);
>
>  extern int kernel_asize;
>  extern int kernel_fsize;
> @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
>         return status;
>  }
>
> -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size)
> +struct exit_boot_struct {
> +       efi_memory_desc_t       *runtime_map;
> +       int                     runtime_entry_count;
> +};
> +
> +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> +{
> +       struct exit_boot_struct *p = priv;
> +
> +       /*
> +        * Update the memory map with virtual addresses. The function will also
> +        * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
> +        * entries so that we can pass it straight to SetVirtualAddressMap()
> +        */
> +       efi_get_virtmap(map->map, map->map_size, map->desc_size,
> +                       p->runtime_map, &p->runtime_entry_count);
> +
> +       return EFI_SUCCESS;
> +}
> +
> +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image,
> +                            unsigned long image_addr, char *cmdline_ptr)
>  {
>         kernel_entry_t real_kernel_entry;
> +       struct exit_boot_struct priv;
> +       unsigned long desc_size;
> +       efi_status_t status;
> +       u32 desc_ver;
> +
> +       status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Unable to retrieve UEFI memory map.\n");
> +               return status;
> +       }
> +
> +       efi_info("Exiting boot services\n");
> +
> +       efi_novamap = false;
> +       status = efi_exit_boot_services(handle, &priv, exit_boot_func);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       /* Install the new virtual address map */
> +       efi_rt_call(set_virtual_address_map,
> +                   priv.runtime_entry_count * desc_size, desc_size,
> +                   desc_ver, priv.runtime_map);
>
>         /* Config Direct Mapping */
>         csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
>         csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
>
>         real_kernel_entry = (kernel_entry_t)
> -               ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS);
> +               ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS);
>
> -       if (!efi_novamap)
> -               real_kernel_entry(true, fdt);
> -       else
> -               real_kernel_entry(false, fdt);
> +       real_kernel_entry(true, (unsigned long)efi_system_table,
> +                         (unsigned long)cmdline_ptr);
>  }
> --
> 2.35.1
>
>
Ard Biesheuvel Sept. 19, 2022, 5:15 a.m. UTC | #2
On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Ard,
>
> I think the parameters passed to the core kernel need to be discussed.
> The old way (so-called old world):
> a0=argc, a1=argv, a1=bpi
>
> The current way (so-called new world):
> a0=efi boot flag, a1=fdt pointer
>
> The new way (in this patchset):
> a0=efi boot flag, a1=systemtable, a2=cmdline
>
> I prefer to use the current way, there are 3 reasons:
> 1, both acpi system and dt system can use the same parameters passing method;

DT systems will use this too. The distinction is between EFI boot and
non-EFI boot. We *really* should keep these separate, given the
experience on ARM, where other projects invent ways to pass those
values to the kernel without going through the stub.

> 2, arm64, riscv and loongarch can use similar logics (light FDT);

No need to repeat a mistake. I intend to fix RISC-V next.

> 3, out-of-tree patches can make compatibility with the old world
> easier by just judging whether a2 is zero.
>

The whole point of this series is that the EFI stub should not be
touching the DT at all. In other words, there is no DT pointer, so the
current method needs to be revised.

What we might do is

a0=systemtable, a1=cmdline

as any non-zero value is treated as logic true. That way, your logic
to test a2 is zero will still work.


>
> On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > LoongArch does not use FDT or DT natively [yet], and the only reason it
> > currently uses it is so that it can reuse the existing EFI stub code.
> >
> > Overloading the DT with data passed between the EFI stub and the core
> > kernel has been a source of problems: there is the overlap between
> > information provided by EFI which DT can also provide (initrd base/size,
> > command line, memory descriptions), requiring us to reason about which
> > is which and what to prioritize. It has also resulted in ABI leaks,
> > i.e., internal ABI being promoted to external ABI inadvertently because
> > the bootloader can set the EFI stub's DT properties as well (e.g.,
> > "kaslr-seed"). This has become especially problematic with boot
> > environments that want to pretend that EFI boot is being done (to access
> > ACPI and SMBIOS tables, for instance) but have no ability to execute the
> > EFI stub, and so the environment that the EFI stub creates is emulated
> > [poorly, in some cases].
> >
> > Another downside of treating DT like this is that the DT binary that the
> > kernel receives is different from the one created by the firmware, which
> > is undesirable in the context of secure and measured boot.
> >
> > Given that LoongArch support in Linux is brand new, we can avoid these
> > pitfalls, and treat the DT strictly as a hardware description, and use a
> > separate handover method between the EFI stub and the kernel. Now that
> > initrd loading and passing the EFI memory map have been refactored into
> > pure EFI routines that use EFI configuration tables, the only thing we
> > need to pass directly is the kernel command line (even if we could pass
> > this via a config table as well, it is used extremely early, so passing
> > it directly is preferred in this case.)
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/loongarch/Kconfig                        |  3 --
> >  arch/loongarch/include/asm/bootinfo.h         |  2 +-
> >  arch/loongarch/kernel/efi.c                   | 30 ++++++++++-
> >  arch/loongarch/kernel/env.c                   | 22 ++++----
> >  arch/loongarch/kernel/head.S                  |  2 +
> >  arch/loongarch/kernel/setup.c                 |  4 +-
> >  drivers/firmware/efi/libstub/Makefile         | 13 +++--
> >  drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++---
> >  8 files changed, 100 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index fca106a8b8af..14a2a1ec8561 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -104,8 +104,6 @@ config LOONGARCH
> >         select MODULES_USE_ELF_RELA if MODULES
> >         select NEED_PER_CPU_EMBED_FIRST_CHUNK
> >         select NEED_PER_CPU_PAGE_FIRST_CHUNK
> > -       select OF
> > -       select OF_EARLY_FLATTREE
> >         select PCI
> >         select PCI_DOMAINS_GENERIC
> >         select PCI_ECAM if ACPI
> > @@ -311,7 +309,6 @@ config DMI
> >  config EFI
> >         bool "EFI runtime service support"
> >         select UCS2_STRING
> > -       select EFI_PARAMS_FROM_FDT
> >         select EFI_RUNTIME_WRAPPERS
> >         help
> >           This enables the kernel to use EFI runtime services that are
> > diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
> > index e02ac4af7f6e..8e5881bc5ad1 100644
> > --- a/arch/loongarch/include/asm/bootinfo.h
> > +++ b/arch/loongarch/include/asm/bootinfo.h
> > @@ -36,7 +36,7 @@ struct loongson_system_configuration {
> >  };
> >
> >  extern u64 efi_system_table;
> > -extern unsigned long fw_arg0, fw_arg1;
> > +extern unsigned long fw_arg0, fw_arg1, fw_arg2;
> >  extern struct loongson_board_info b_info;
> >  extern struct loongson_system_configuration loongson_sysconf;
> >
> > diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
> > index 1f1f755fb425..3b80675726ec 100644
> > --- a/arch/loongarch/kernel/efi.c
> > +++ b/arch/loongarch/kernel/efi.c
> > @@ -27,8 +27,13 @@
> >  static unsigned long efi_nr_tables;
> >  static unsigned long efi_config_table;
> >
> > +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR;
> > +
> >  static efi_system_table_t *efi_systab;
> > -static efi_config_table_type_t arch_tables[] __initdata = {{},};
> > +static efi_config_table_type_t arch_tables[] __initdata = {
> > +       {LINUX_EFI_BOOT_MEMMAP_GUID,    &boot_memmap,   "MEMMAP" },
> > +       {},
> > +};
> >
> >  void __init efi_runtime_init(void)
> >  {
> > @@ -61,6 +66,8 @@ void __init efi_init(void)
> >                 return;
> >         }
> >
> > +       efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor);
> > +
> >         set_bit(EFI_64BIT, &efi.flags);
> >         efi_nr_tables    = efi_systab->nr_tables;
> >         efi_config_table = (unsigned long)efi_systab->tables;
> > @@ -72,4 +79,25 @@ void __init efi_init(void)
> >
> >         if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
> >                 memblock_reserve(screen_info.lfb_base, screen_info.lfb_size);
> > +
> > +       if (boot_memmap != EFI_INVALID_TABLE_ADDR) {
> > +               struct efi_memory_map_data data;
> > +               struct efi_boot_memmap *tbl;
> > +
> > +               tbl = early_memremap_ro(boot_memmap, sizeof(*tbl));
> > +               if (tbl) {
> > +                       data.phys_map           = boot_memmap + sizeof(*tbl);
> > +                       data.size               = tbl->map_size;
> > +                       data.desc_size          = tbl->desc_size;
> > +                       data.desc_version       = tbl->desc_ver;
> > +
> > +                       if (efi_memmap_init_early(&data) < 0)
> > +                               panic("Unable to map EFI memory map.\n");
> > +
> > +                       memblock_reserve(data.phys_map & PAGE_MASK,
> > +                                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> > +
> > +                       early_memunmap(tbl, sizeof(*tbl));
> > +               }
> > +       }
> >  }
> > diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
> > index 82b478a5c665..05c38d28476e 100644
> > --- a/arch/loongarch/kernel/env.c
> > +++ b/arch/loongarch/kernel/env.c
> > @@ -8,7 +8,6 @@
> >  #include <linux/efi.h>
> >  #include <linux/export.h>
> >  #include <linux/memblock.h>
> > -#include <linux/of_fdt.h>
> >  #include <asm/early_ioremap.h>
> >  #include <asm/bootinfo.h>
> >  #include <asm/loongson.h>
> > @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf);
> >  void __init init_environ(void)
> >  {
> >         int efi_boot = fw_arg0;
> > -       struct efi_memory_map_data data;
> > -       void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> >
> > -       if (efi_boot)
> > -               set_bit(EFI_BOOT, &efi.flags);
> > -       else
> > -               clear_bit(EFI_BOOT, &efi.flags);
> > +       if (efi_boot) {
> > +               char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE);
> >
> > -       early_init_dt_scan(fdt_ptr);
> > -       early_init_fdt_reserve_self();
> > -       efi_system_table = efi_get_fdt_params(&data);
> > +               strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE);
> > +               early_memunmap(cmdline, COMMAND_LINE_SIZE);
> >
> > -       efi_memmap_init_early(&data);
> > -       memblock_reserve(data.phys_map & PAGE_MASK,
> > -                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> > +               efi_system_table = fw_arg1;
> > +               set_bit(EFI_BOOT, &efi.flags);
> > +       } else {
> > +               clear_bit(EFI_BOOT, &efi.flags);
> > +       }
> >  }
> >
> >  static int __init init_cpu_fullname(void)
> > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > index 01bac62a6442..8f89f39fd31b 100644
> > --- a/arch/loongarch/kernel/head.S
> > +++ b/arch/loongarch/kernel/head.S
> > @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry)                  # kernel entry point
> >         st.d            a0, t0, 0               # firmware arguments
> >         la              t0, fw_arg1
> >         st.d            a1, t0, 0
> > +       la              t0, fw_arg2
> > +       st.d            a2, t0, 0
> >
> >         /* KSave3 used for percpu base, initialized as 0 */
> >         csrwr           zero, PERCPU_BASE_KS
> > diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> > index e8714b1d94c8..7fabf2306e80 100644
> > --- a/arch/loongarch/kernel/setup.c
> > +++ b/arch/loongarch/kernel/setup.c
> > @@ -51,7 +51,7 @@
> >
> >  struct screen_info screen_info __section(".data");
> >
> > -unsigned long fw_arg0, fw_arg1;
> > +unsigned long fw_arg0, fw_arg1, fw_arg2;
> >  DEFINE_PER_CPU(unsigned long, kernelsp);
> >  struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly;
> >
> > @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem);
> >
> >  void __init platform_init(void)
> >  {
> > -       efi_init();
> >  #ifdef CONFIG_ACPI_TABLE_UPGRADE
> >         acpi_table_upgrade();
> >  #endif
> > @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p)
> >         *cmdline_p = boot_command_line;
> >
> >         init_environ();
> > +       efi_init();
> >         memblock_init();
> >         parse_early_param();
> >
> > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > index 0dbc6d93f2e6..d8d6657e6277 100644
> > --- a/drivers/firmware/efi/libstub/Makefile
> > +++ b/drivers/firmware/efi/libstub/Makefile
> > @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV)                := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >  cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                    -fpie
> >
> > -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)   += -I$(srctree)/scripts/dtc/libfdt
> >
> >  KBUILD_CFLAGS                  := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
> >                                    -include $(srctree)/include/linux/hidden.h \
> > @@ -60,14 +60,17 @@ lib-y                               := efi-stub-helper.o gop.o secureboot.o tpm.o \
> >                                    alignedmem.o relocate.o vsprintf.o \
> >                                    systable.o
> >
> > -# include the stub's generic dependencies from lib/ when building for ARM/arm64
> > -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > +# include the stub's libfdt dependencies from lib/ when needed
> > +libfdt-deps                    := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \
> > +                                  fdt_empty_tree.c fdt_sw.c
> > +
> > +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \
> > +                                    $(patsubst %.c,lib-%.o,$(libfdt-deps))
> >
> >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> >         $(call if_changed_rule,cc_o_c)
> >
> > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \
> > -                                  $(patsubst %.c,lib-%.o,$(efi-deps-y))
> > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o
> >
> >  lib-$(CONFIG_ARM)              += arm32-stub.o
> >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > index b7ef8d2df59e..7c684d10f936 100644
> > --- a/drivers/firmware/efi/libstub/loongarch-stub.c
> > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > @@ -9,7 +9,8 @@
> >  #include <asm/addrspace.h>
> >  #include "efistub.h"
> >
> > -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt);
> > +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab,
> > +                                         unsigned long cmdline);
> >
> >  extern int kernel_asize;
> >  extern int kernel_fsize;
> > @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
> >         return status;
> >  }
> >
> > -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size)
> > +struct exit_boot_struct {
> > +       efi_memory_desc_t       *runtime_map;
> > +       int                     runtime_entry_count;
> > +};
> > +
> > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > +{
> > +       struct exit_boot_struct *p = priv;
> > +
> > +       /*
> > +        * Update the memory map with virtual addresses. The function will also
> > +        * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
> > +        * entries so that we can pass it straight to SetVirtualAddressMap()
> > +        */
> > +       efi_get_virtmap(map->map, map->map_size, map->desc_size,
> > +                       p->runtime_map, &p->runtime_entry_count);
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image,
> > +                            unsigned long image_addr, char *cmdline_ptr)
> >  {
> >         kernel_entry_t real_kernel_entry;
> > +       struct exit_boot_struct priv;
> > +       unsigned long desc_size;
> > +       efi_status_t status;
> > +       u32 desc_ver;
> > +
> > +       status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > +               return status;
> > +       }
> > +
> > +       efi_info("Exiting boot services\n");
> > +
> > +       efi_novamap = false;
> > +       status = efi_exit_boot_services(handle, &priv, exit_boot_func);
> > +       if (status != EFI_SUCCESS)
> > +               return status;
> > +
> > +       /* Install the new virtual address map */
> > +       efi_rt_call(set_virtual_address_map,
> > +                   priv.runtime_entry_count * desc_size, desc_size,
> > +                   desc_ver, priv.runtime_map);
> >
> >         /* Config Direct Mapping */
> >         csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> >         csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> >
> >         real_kernel_entry = (kernel_entry_t)
> > -               ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS);
> > +               ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS);
> >
> > -       if (!efi_novamap)
> > -               real_kernel_entry(true, fdt);
> > -       else
> > -               real_kernel_entry(false, fdt);
> > +       real_kernel_entry(true, (unsigned long)efi_system_table,
> > +                         (unsigned long)cmdline_ptr);
> >  }
> > --
> > 2.35.1
> >
> >
Huacai Chen Sept. 19, 2022, 6:06 a.m. UTC | #3
Hi, Ard,

On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Ard,
> >
> > I think the parameters passed to the core kernel need to be discussed.
> > The old way (so-called old world):
> > a0=argc, a1=argv, a1=bpi
> >
> > The current way (so-called new world):
> > a0=efi boot flag, a1=fdt pointer
> >
> > The new way (in this patchset):
> > a0=efi boot flag, a1=systemtable, a2=cmdline
> >
> > I prefer to use the current way, there are 3 reasons:
> > 1, both acpi system and dt system can use the same parameters passing method;
>
> DT systems will use this too. The distinction is between EFI boot and
> non-EFI boot. We *really* should keep these separate, given the
> experience on ARM, where other projects invent ways to pass those
> values to the kernel without going through the stub.
In the last patch I see:
+               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
+
+               early_init_dt_scan(fdt_ptr);
+               early_init_fdt_reserve_self();
+
                clear_bit(EFI_BOOT, &efi.flags);
So I suppose for the DT system that means a0=efi boot flag, a1=fdt
pointer, a2=cmdline? Then it is not exactly the same as the ACPI
system, but similar.

>
> > 2, arm64, riscv and loongarch can use similar logics (light FDT);
>
> No need to repeat a mistake. I intend to fix RISC-V next.
>
> > 3, out-of-tree patches can make compatibility with the old world
> > easier by just judging whether a2 is zero.
> >
>
> The whole point of this series is that the EFI stub should not be
> touching the DT at all. In other words, there is no DT pointer, so the
> current method needs to be revised.
>
> What we might do is
>
> a0=systemtable, a1=cmdline
>
> as any non-zero value is treated as logic true. That way, your logic
> to test a2 is zero will still work.
I think the efi boot flag is still needed, even boot from efistub.
Because if boot with "efi=novamap", the efi runtime should be
disabled. Then we need efi_enabled(EFI_BOOT) to be false in
efi_init().

Huacai

>
>
> >
> > On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > LoongArch does not use FDT or DT natively [yet], and the only reason it
> > > currently uses it is so that it can reuse the existing EFI stub code.
> > >
> > > Overloading the DT with data passed between the EFI stub and the core
> > > kernel has been a source of problems: there is the overlap between
> > > information provided by EFI which DT can also provide (initrd base/size,
> > > command line, memory descriptions), requiring us to reason about which
> > > is which and what to prioritize. It has also resulted in ABI leaks,
> > > i.e., internal ABI being promoted to external ABI inadvertently because
> > > the bootloader can set the EFI stub's DT properties as well (e.g.,
> > > "kaslr-seed"). This has become especially problematic with boot
> > > environments that want to pretend that EFI boot is being done (to access
> > > ACPI and SMBIOS tables, for instance) but have no ability to execute the
> > > EFI stub, and so the environment that the EFI stub creates is emulated
> > > [poorly, in some cases].
> > >
> > > Another downside of treating DT like this is that the DT binary that the
> > > kernel receives is different from the one created by the firmware, which
> > > is undesirable in the context of secure and measured boot.
> > >
> > > Given that LoongArch support in Linux is brand new, we can avoid these
> > > pitfalls, and treat the DT strictly as a hardware description, and use a
> > > separate handover method between the EFI stub and the kernel. Now that
> > > initrd loading and passing the EFI memory map have been refactored into
> > > pure EFI routines that use EFI configuration tables, the only thing we
> > > need to pass directly is the kernel command line (even if we could pass
> > > this via a config table as well, it is used extremely early, so passing
> > > it directly is preferred in this case.)
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > >  arch/loongarch/Kconfig                        |  3 --
> > >  arch/loongarch/include/asm/bootinfo.h         |  2 +-
> > >  arch/loongarch/kernel/efi.c                   | 30 ++++++++++-
> > >  arch/loongarch/kernel/env.c                   | 22 ++++----
> > >  arch/loongarch/kernel/head.S                  |  2 +
> > >  arch/loongarch/kernel/setup.c                 |  4 +-
> > >  drivers/firmware/efi/libstub/Makefile         | 13 +++--
> > >  drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++---
> > >  8 files changed, 100 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > > index fca106a8b8af..14a2a1ec8561 100644
> > > --- a/arch/loongarch/Kconfig
> > > +++ b/arch/loongarch/Kconfig
> > > @@ -104,8 +104,6 @@ config LOONGARCH
> > >         select MODULES_USE_ELF_RELA if MODULES
> > >         select NEED_PER_CPU_EMBED_FIRST_CHUNK
> > >         select NEED_PER_CPU_PAGE_FIRST_CHUNK
> > > -       select OF
> > > -       select OF_EARLY_FLATTREE
> > >         select PCI
> > >         select PCI_DOMAINS_GENERIC
> > >         select PCI_ECAM if ACPI
> > > @@ -311,7 +309,6 @@ config DMI
> > >  config EFI
> > >         bool "EFI runtime service support"
> > >         select UCS2_STRING
> > > -       select EFI_PARAMS_FROM_FDT
> > >         select EFI_RUNTIME_WRAPPERS
> > >         help
> > >           This enables the kernel to use EFI runtime services that are
> > > diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
> > > index e02ac4af7f6e..8e5881bc5ad1 100644
> > > --- a/arch/loongarch/include/asm/bootinfo.h
> > > +++ b/arch/loongarch/include/asm/bootinfo.h
> > > @@ -36,7 +36,7 @@ struct loongson_system_configuration {
> > >  };
> > >
> > >  extern u64 efi_system_table;
> > > -extern unsigned long fw_arg0, fw_arg1;
> > > +extern unsigned long fw_arg0, fw_arg1, fw_arg2;
> > >  extern struct loongson_board_info b_info;
> > >  extern struct loongson_system_configuration loongson_sysconf;
> > >
> > > diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
> > > index 1f1f755fb425..3b80675726ec 100644
> > > --- a/arch/loongarch/kernel/efi.c
> > > +++ b/arch/loongarch/kernel/efi.c
> > > @@ -27,8 +27,13 @@
> > >  static unsigned long efi_nr_tables;
> > >  static unsigned long efi_config_table;
> > >
> > > +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR;
> > > +
> > >  static efi_system_table_t *efi_systab;
> > > -static efi_config_table_type_t arch_tables[] __initdata = {{},};
> > > +static efi_config_table_type_t arch_tables[] __initdata = {
> > > +       {LINUX_EFI_BOOT_MEMMAP_GUID,    &boot_memmap,   "MEMMAP" },
> > > +       {},
> > > +};
> > >
> > >  void __init efi_runtime_init(void)
> > >  {
> > > @@ -61,6 +66,8 @@ void __init efi_init(void)
> > >                 return;
> > >         }
> > >
> > > +       efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor);
> > > +
> > >         set_bit(EFI_64BIT, &efi.flags);
> > >         efi_nr_tables    = efi_systab->nr_tables;
> > >         efi_config_table = (unsigned long)efi_systab->tables;
> > > @@ -72,4 +79,25 @@ void __init efi_init(void)
> > >
> > >         if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
> > >                 memblock_reserve(screen_info.lfb_base, screen_info.lfb_size);
> > > +
> > > +       if (boot_memmap != EFI_INVALID_TABLE_ADDR) {
> > > +               struct efi_memory_map_data data;
> > > +               struct efi_boot_memmap *tbl;
> > > +
> > > +               tbl = early_memremap_ro(boot_memmap, sizeof(*tbl));
> > > +               if (tbl) {
> > > +                       data.phys_map           = boot_memmap + sizeof(*tbl);
> > > +                       data.size               = tbl->map_size;
> > > +                       data.desc_size          = tbl->desc_size;
> > > +                       data.desc_version       = tbl->desc_ver;
> > > +
> > > +                       if (efi_memmap_init_early(&data) < 0)
> > > +                               panic("Unable to map EFI memory map.\n");
> > > +
> > > +                       memblock_reserve(data.phys_map & PAGE_MASK,
> > > +                                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> > > +
> > > +                       early_memunmap(tbl, sizeof(*tbl));
> > > +               }
> > > +       }
> > >  }
> > > diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
> > > index 82b478a5c665..05c38d28476e 100644
> > > --- a/arch/loongarch/kernel/env.c
> > > +++ b/arch/loongarch/kernel/env.c
> > > @@ -8,7 +8,6 @@
> > >  #include <linux/efi.h>
> > >  #include <linux/export.h>
> > >  #include <linux/memblock.h>
> > > -#include <linux/of_fdt.h>
> > >  #include <asm/early_ioremap.h>
> > >  #include <asm/bootinfo.h>
> > >  #include <asm/loongson.h>
> > > @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf);
> > >  void __init init_environ(void)
> > >  {
> > >         int efi_boot = fw_arg0;
> > > -       struct efi_memory_map_data data;
> > > -       void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > >
> > > -       if (efi_boot)
> > > -               set_bit(EFI_BOOT, &efi.flags);
> > > -       else
> > > -               clear_bit(EFI_BOOT, &efi.flags);
> > > +       if (efi_boot) {
> > > +               char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE);
> > >
> > > -       early_init_dt_scan(fdt_ptr);
> > > -       early_init_fdt_reserve_self();
> > > -       efi_system_table = efi_get_fdt_params(&data);
> > > +               strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE);
> > > +               early_memunmap(cmdline, COMMAND_LINE_SIZE);
> > >
> > > -       efi_memmap_init_early(&data);
> > > -       memblock_reserve(data.phys_map & PAGE_MASK,
> > > -                        PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
> > > +               efi_system_table = fw_arg1;
> > > +               set_bit(EFI_BOOT, &efi.flags);
> > > +       } else {
> > > +               clear_bit(EFI_BOOT, &efi.flags);
> > > +       }
> > >  }
> > >
> > >  static int __init init_cpu_fullname(void)
> > > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > > index 01bac62a6442..8f89f39fd31b 100644
> > > --- a/arch/loongarch/kernel/head.S
> > > +++ b/arch/loongarch/kernel/head.S
> > > @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry)                  # kernel entry point
> > >         st.d            a0, t0, 0               # firmware arguments
> > >         la              t0, fw_arg1
> > >         st.d            a1, t0, 0
> > > +       la              t0, fw_arg2
> > > +       st.d            a2, t0, 0
> > >
> > >         /* KSave3 used for percpu base, initialized as 0 */
> > >         csrwr           zero, PERCPU_BASE_KS
> > > diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> > > index e8714b1d94c8..7fabf2306e80 100644
> > > --- a/arch/loongarch/kernel/setup.c
> > > +++ b/arch/loongarch/kernel/setup.c
> > > @@ -51,7 +51,7 @@
> > >
> > >  struct screen_info screen_info __section(".data");
> > >
> > > -unsigned long fw_arg0, fw_arg1;
> > > +unsigned long fw_arg0, fw_arg1, fw_arg2;
> > >  DEFINE_PER_CPU(unsigned long, kernelsp);
> > >  struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly;
> > >
> > > @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem);
> > >
> > >  void __init platform_init(void)
> > >  {
> > > -       efi_init();
> > >  #ifdef CONFIG_ACPI_TABLE_UPGRADE
> > >         acpi_table_upgrade();
> > >  #endif
> > > @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p)
> > >         *cmdline_p = boot_command_line;
> > >
> > >         init_environ();
> > > +       efi_init();
> > >         memblock_init();
> > >         parse_early_param();
> > >
> > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > index 0dbc6d93f2e6..d8d6657e6277 100644
> > > --- a/drivers/firmware/efi/libstub/Makefile
> > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV)                := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >  cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >                                    -fpie
> > >
> > > -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)   += -I$(srctree)/scripts/dtc/libfdt
> > >
> > >  KBUILD_CFLAGS                  := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
> > >                                    -include $(srctree)/include/linux/hidden.h \
> > > @@ -60,14 +60,17 @@ lib-y                               := efi-stub-helper.o gop.o secureboot.o tpm.o \
> > >                                    alignedmem.o relocate.o vsprintf.o \
> > >                                    systable.o
> > >
> > > -# include the stub's generic dependencies from lib/ when building for ARM/arm64
> > > -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > > +# include the stub's libfdt dependencies from lib/ when needed
> > > +libfdt-deps                    := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \
> > > +                                  fdt_empty_tree.c fdt_sw.c
> > > +
> > > +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \
> > > +                                    $(patsubst %.c,lib-%.o,$(libfdt-deps))
> > >
> > >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> > >         $(call if_changed_rule,cc_o_c)
> > >
> > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \
> > > -                                  $(patsubst %.c,lib-%.o,$(efi-deps-y))
> > > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o
> > >
> > >  lib-$(CONFIG_ARM)              += arm32-stub.o
> > >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> > > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > index b7ef8d2df59e..7c684d10f936 100644
> > > --- a/drivers/firmware/efi/libstub/loongarch-stub.c
> > > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > @@ -9,7 +9,8 @@
> > >  #include <asm/addrspace.h>
> > >  #include "efistub.h"
> > >
> > > -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt);
> > > +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab,
> > > +                                         unsigned long cmdline);
> > >
> > >  extern int kernel_asize;
> > >  extern int kernel_fsize;
> > > @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
> > >         return status;
> > >  }
> > >
> > > -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size)
> > > +struct exit_boot_struct {
> > > +       efi_memory_desc_t       *runtime_map;
> > > +       int                     runtime_entry_count;
> > > +};
> > > +
> > > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > > +{
> > > +       struct exit_boot_struct *p = priv;
> > > +
> > > +       /*
> > > +        * Update the memory map with virtual addresses. The function will also
> > > +        * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
> > > +        * entries so that we can pass it straight to SetVirtualAddressMap()
> > > +        */
> > > +       efi_get_virtmap(map->map, map->map_size, map->desc_size,
> > > +                       p->runtime_map, &p->runtime_entry_count);
> > > +
> > > +       return EFI_SUCCESS;
> > > +}
> > > +
> > > +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image,
> > > +                            unsigned long image_addr, char *cmdline_ptr)
> > >  {
> > >         kernel_entry_t real_kernel_entry;
> > > +       struct exit_boot_struct priv;
> > > +       unsigned long desc_size;
> > > +       efi_status_t status;
> > > +       u32 desc_ver;
> > > +
> > > +       status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > > +               return status;
> > > +       }
> > > +
> > > +       efi_info("Exiting boot services\n");
> > > +
> > > +       efi_novamap = false;
> > > +       status = efi_exit_boot_services(handle, &priv, exit_boot_func);
> > > +       if (status != EFI_SUCCESS)
> > > +               return status;
> > > +
> > > +       /* Install the new virtual address map */
> > > +       efi_rt_call(set_virtual_address_map,
> > > +                   priv.runtime_entry_count * desc_size, desc_size,
> > > +                   desc_ver, priv.runtime_map);
> > >
> > >         /* Config Direct Mapping */
> > >         csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> > >         csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> > >
> > >         real_kernel_entry = (kernel_entry_t)
> > > -               ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS);
> > > +               ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS);
> > >
> > > -       if (!efi_novamap)
> > > -               real_kernel_entry(true, fdt);
> > > -       else
> > > -               real_kernel_entry(false, fdt);
> > > +       real_kernel_entry(true, (unsigned long)efi_system_table,
> > > +                         (unsigned long)cmdline_ptr);
> > >  }
> > > --
> > > 2.35.1
> > >
> > >
Ard Biesheuvel Sept. 19, 2022, 6:22 a.m. UTC | #4
On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
>
>  Hi, Ard,
>
> On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Ard,
> > >
> > > I think the parameters passed to the core kernel need to be discussed.
> > > The old way (so-called old world):
> > > a0=argc, a1=argv, a1=bpi
> > >
> > > The current way (so-called new world):
> > > a0=efi boot flag, a1=fdt pointer
> > >
> > > The new way (in this patchset):
> > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > >
> > > I prefer to use the current way, there are 3 reasons:
> > > 1, both acpi system and dt system can use the same parameters passing method;
> >
> > DT systems will use this too. The distinction is between EFI boot and
> > non-EFI boot. We *really* should keep these separate, given the
> > experience on ARM, where other projects invent ways to pass those
> > values to the kernel without going through the stub.
> In the last patch I see:
> +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> +
> +               early_init_dt_scan(fdt_ptr);
> +               early_init_fdt_reserve_self();
> +
>                 clear_bit(EFI_BOOT, &efi.flags);
> So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> system, but similar.
>

No, for non-EFI DT boot, the command line is passed via the DT, so
a0=0x0 (non-efi), a1=DT, a2=0x0

Do you intend to support non-EFI DT boot by the way?

So

a2  != 0x0 means old world
a0 != 0x0 means EFI boot, a1 is the command line
a0 == 0x0 means !EFI boot, a1 is the DT

> >
> > > 2, arm64, riscv and loongarch can use similar logics (light FDT);
> >
> > No need to repeat a mistake. I intend to fix RISC-V next.
> >
> > > 3, out-of-tree patches can make compatibility with the old world
> > > easier by just judging whether a2 is zero.
> > >
> >
> > The whole point of this series is that the EFI stub should not be
> > touching the DT at all. In other words, there is no DT pointer, so the
> > current method needs to be revised.
> >
> > What we might do is
> >
> > a0=systemtable, a1=cmdline
> >
> > as any non-zero value is treated as logic true. That way, your logic
> > to test a2 is zero will still work.
> I think the efi boot flag is still needed, even boot from efistub.
> Because if boot with "efi=novamap", the efi runtime should be
> disabled. Then we need efi_enabled(EFI_BOOT) to be false in
> efi_init().
>

I don't think it makes sense to allow efi=novamap on LoongArch, given
that we cannot make use of the runtime services that way. So in my
code, efi_novamap is set to false unconditionally.
Ard Biesheuvel Sept. 19, 2022, 6:33 a.m. UTC | #5
On Mon, 19 Sept 2022 at 08:22, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> >  Hi, Ard,
> >
> > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Ard,
> > > >
> > > > I think the parameters passed to the core kernel need to be discussed.
> > > > The old way (so-called old world):
> > > > a0=argc, a1=argv, a1=bpi
> > > >
> > > > The current way (so-called new world):
> > > > a0=efi boot flag, a1=fdt pointer
> > > >
> > > > The new way (in this patchset):
> > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > >
> > > > I prefer to use the current way, there are 3 reasons:
> > > > 1, both acpi system and dt system can use the same parameters passing method;
> > >
> > > DT systems will use this too. The distinction is between EFI boot and
> > > non-EFI boot. We *really* should keep these separate, given the
> > > experience on ARM, where other projects invent ways to pass those
> > > values to the kernel without going through the stub.
> > In the last patch I see:
> > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > +
> > +               early_init_dt_scan(fdt_ptr);
> > +               early_init_fdt_reserve_self();
> > +
> >                 clear_bit(EFI_BOOT, &efi.flags);
> > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > system, but similar.
> >
>
> No, for non-EFI DT boot, the command line is passed via the DT, so
> a0=0x0 (non-efi), a1=DT, a2=0x0
>
> Do you intend to support non-EFI DT boot by the way?
>
> So
>
> a2  != 0x0 means old world
> a0 != 0x0 means EFI boot, a1 is the command line
> a0 == 0x0 means !EFI boot, a1 is the DT
>

Note: the above applies if we decide to merge the EFI boolean and the
system table pointer into register #0.
Huacai Chen Sept. 19, 2022, 10:33 a.m. UTC | #6
Hi, Ard,

On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> >  Hi, Ard,
> >
> > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Ard,
> > > >
> > > > I think the parameters passed to the core kernel need to be discussed.
> > > > The old way (so-called old world):
> > > > a0=argc, a1=argv, a1=bpi
> > > >
> > > > The current way (so-called new world):
> > > > a0=efi boot flag, a1=fdt pointer
> > > >
> > > > The new way (in this patchset):
> > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > >
> > > > I prefer to use the current way, there are 3 reasons:
> > > > 1, both acpi system and dt system can use the same parameters passing method;
> > >
> > > DT systems will use this too. The distinction is between EFI boot and
> > > non-EFI boot. We *really* should keep these separate, given the
> > > experience on ARM, where other projects invent ways to pass those
> > > values to the kernel without going through the stub.
> > In the last patch I see:
> > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > +
> > +               early_init_dt_scan(fdt_ptr);
> > +               early_init_fdt_reserve_self();
> > +
> >                 clear_bit(EFI_BOOT, &efi.flags);
> > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > system, but similar.
> >
>
> No, for non-EFI DT boot, the command line is passed via the DT, so
> a0=0x0 (non-efi), a1=DT, a2=0x0
>
> Do you intend to support non-EFI DT boot by the way?
I think we needn't support non-EFI DT boot, so a0=efi boot flag,
a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
which looks similar to the old-world). But I have another question: is
it early enough to get DT from systemtable for DT boot (in the current
way DT is the earliest thing)?

>
> So
>
> a2  != 0x0 means old world
> a0 != 0x0 means EFI boot, a1 is the command line
> a0 == 0x0 means !EFI boot, a1 is the DT
>
> > >
> > > > 2, arm64, riscv and loongarch can use similar logics (light FDT);
> > >
> > > No need to repeat a mistake. I intend to fix RISC-V next.
> > >
> > > > 3, out-of-tree patches can make compatibility with the old world
> > > > easier by just judging whether a2 is zero.
> > > >
> > >
> > > The whole point of this series is that the EFI stub should not be
> > > touching the DT at all. In other words, there is no DT pointer, so the
> > > current method needs to be revised.
> > >
> > > What we might do is
> > >
> > > a0=systemtable, a1=cmdline
> > >
> > > as any non-zero value is treated as logic true. That way, your logic
> > > to test a2 is zero will still work.
> > I think the efi boot flag is still needed, even boot from efistub.
> > Because if boot with "efi=novamap", the efi runtime should be
> > disabled. Then we need efi_enabled(EFI_BOOT) to be false in
> > efi_init().
> >
>
> I don't think it makes sense to allow efi=novamap on LoongArch, given
> that we cannot make use of the runtime services that way. So in my
> code, efi_novamap is set to false unconditionally.
Emm, I prefer to support "efi=novamap", the core kernel has already
supported "noefi" to disable runtime, I don't want to hack
efi_novamap. So please keep the efi boot flag, thanks.

Huacai
Ard Biesheuvel Sept. 19, 2022, 10:37 a.m. UTC | #7
On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Ard,
>
> On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > >  Hi, Ard,
> > >
> > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Ard,
> > > > >
> > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > The old way (so-called old world):
> > > > > a0=argc, a1=argv, a1=bpi
> > > > >
> > > > > The current way (so-called new world):
> > > > > a0=efi boot flag, a1=fdt pointer
> > > > >
> > > > > The new way (in this patchset):
> > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > >
> > > > > I prefer to use the current way, there are 3 reasons:
> > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > >
> > > > DT systems will use this too. The distinction is between EFI boot and
> > > > non-EFI boot. We *really* should keep these separate, given the
> > > > experience on ARM, where other projects invent ways to pass those
> > > > values to the kernel without going through the stub.
> > > In the last patch I see:
> > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > +
> > > +               early_init_dt_scan(fdt_ptr);
> > > +               early_init_fdt_reserve_self();
> > > +
> > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > system, but similar.
> > >
> >
> > No, for non-EFI DT boot, the command line is passed via the DT, so
> > a0=0x0 (non-efi), a1=DT, a2=0x0
> >
> > Do you intend to support non-EFI DT boot by the way?
> I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> which looks similar to the old-world).

Excellent. If non-EFI boot is not supported, we can drop the code that
deals with that.

> But I have another question: is
> it early enough to get DT from systemtable for DT boot (in the current
> way DT is the earliest thing)?
>

If you rely on DT only for the hardware description, then loading it
from efi_init() should be fine.

> >
> > So
> >
> > a2  != 0x0 means old world
> > a0 != 0x0 means EFI boot, a1 is the command line
> > a0 == 0x0 means !EFI boot, a1 is the DT
> >
> > > >
> > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT);
> > > >
> > > > No need to repeat a mistake. I intend to fix RISC-V next.
> > > >
> > > > > 3, out-of-tree patches can make compatibility with the old world
> > > > > easier by just judging whether a2 is zero.
> > > > >
> > > >
> > > > The whole point of this series is that the EFI stub should not be
> > > > touching the DT at all. In other words, there is no DT pointer, so the
> > > > current method needs to be revised.
> > > >
> > > > What we might do is
> > > >
> > > > a0=systemtable, a1=cmdline
> > > >
> > > > as any non-zero value is treated as logic true. That way, your logic
> > > > to test a2 is zero will still work.
> > > I think the efi boot flag is still needed, even boot from efistub.
> > > Because if boot with "efi=novamap", the efi runtime should be
> > > disabled. Then we need efi_enabled(EFI_BOOT) to be false in
> > > efi_init().
> > >
> >
> > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > that we cannot make use of the runtime services that way. So in my
> > code, efi_novamap is set to false unconditionally.
> Emm, I prefer to support "efi=novamap", the core kernel has already
> supported "noefi" to disable runtime, I don't want to hack
> efi_novamap. So please keep the efi boot flag, thanks.
>

Fair enough. Do you have any uses for efi_novamap in mind?
Huacai Chen Sept. 19, 2022, 10:47 a.m. UTC | #8
On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Ard,
> >
> > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > >  Hi, Ard,
> > > >
> > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Ard,
> > > > > >
> > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > The old way (so-called old world):
> > > > > > a0=argc, a1=argv, a1=bpi
> > > > > >
> > > > > > The current way (so-called new world):
> > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > >
> > > > > > The new way (in this patchset):
> > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > >
> > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > >
> > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > experience on ARM, where other projects invent ways to pass those
> > > > > values to the kernel without going through the stub.
> > > > In the last patch I see:
> > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > +
> > > > +               early_init_dt_scan(fdt_ptr);
> > > > +               early_init_fdt_reserve_self();
> > > > +
> > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > system, but similar.
> > > >
> > >
> > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > >
> > > Do you intend to support non-EFI DT boot by the way?
> > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > which looks similar to the old-world).
>
> Excellent. If non-EFI boot is not supported, we can drop the code that
> deals with that.
>
> > But I have another question: is
> > it early enough to get DT from systemtable for DT boot (in the current
> > way DT is the earliest thing)?
> >
>
> If you rely on DT only for the hardware description, then loading it
> from efi_init() should be fine.
OK, then please drop patch #12 at this time. It can be added when we
add Loongson-2K support.

>
> > >
> > > So
> > >
> > > a2  != 0x0 means old world
> > > a0 != 0x0 means EFI boot, a1 is the command line
> > > a0 == 0x0 means !EFI boot, a1 is the DT
> > >
> > > > >
> > > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT);
> > > > >
> > > > > No need to repeat a mistake. I intend to fix RISC-V next.
> > > > >
> > > > > > 3, out-of-tree patches can make compatibility with the old world
> > > > > > easier by just judging whether a2 is zero.
> > > > > >
> > > > >
> > > > > The whole point of this series is that the EFI stub should not be
> > > > > touching the DT at all. In other words, there is no DT pointer, so the
> > > > > current method needs to be revised.
> > > > >
> > > > > What we might do is
> > > > >
> > > > > a0=systemtable, a1=cmdline
> > > > >
> > > > > as any non-zero value is treated as logic true. That way, your logic
> > > > > to test a2 is zero will still work.
> > > > I think the efi boot flag is still needed, even boot from efistub.
> > > > Because if boot with "efi=novamap", the efi runtime should be
> > > > disabled. Then we need efi_enabled(EFI_BOOT) to be false in
> > > > efi_init().
> > > >
> > >
> > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > that we cannot make use of the runtime services that way. So in my
> > > code, efi_novamap is set to false unconditionally.
> > Emm, I prefer to support "efi=novamap", the core kernel has already
> > supported "noefi" to disable runtime, I don't want to hack
> > efi_novamap. So please keep the efi boot flag, thanks.
> >
>
> Fair enough. Do you have any uses for efi_novamap in mind?
I usually use "efi=novamap" in EFI shell to see whether it can work
well without runtime. And I think I will modify this patch these days
because in another thread you said that this series cannot boot.

Huacai
Ard Biesheuvel Sept. 19, 2022, 10:49 a.m. UTC | #9
On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Ard,
> > >
> > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > >  Hi, Ard,
> > > > >
> > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Ard,
> > > > > > >
> > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > The old way (so-called old world):
> > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > >
> > > > > > > The current way (so-called new world):
> > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > >
> > > > > > > The new way (in this patchset):
> > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > >
> > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > >
> > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > values to the kernel without going through the stub.
> > > > > In the last patch I see:
> > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > +
> > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > +               early_init_fdt_reserve_self();
> > > > > +
> > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > system, but similar.
> > > > >
> > > >
> > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > >
> > > > Do you intend to support non-EFI DT boot by the way?
> > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > which looks similar to the old-world).
> >
> > Excellent. If non-EFI boot is not supported, we can drop the code that
> > deals with that.
> >
> > > But I have another question: is
> > > it early enough to get DT from systemtable for DT boot (in the current
> > > way DT is the earliest thing)?
> > >
> >
> > If you rely on DT only for the hardware description, then loading it
> > from efi_init() should be fine.
> OK, then please drop patch #12 at this time. It can be added when we
> add Loongson-2K support.
>

OK

> > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > that we cannot make use of the runtime services that way. So in my
> > > > code, efi_novamap is set to false unconditionally.
> > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > supported "noefi" to disable runtime, I don't want to hack
> > > efi_novamap. So please keep the efi boot flag, thanks.
> > >
> >
> > Fair enough. Do you have any uses for efi_novamap in mind?
> I usually use "efi=novamap" in EFI shell to see whether it can work
> well without runtime. And I think I will modify this patch these days
> because in another thread you said that this series cannot boot.

It works fine now.

However, clearing the EFI_BOOT flag is not the correct way to disable
runtime services only. And note that we also have efi=noruntime - does
that currently work on LoongArch?
Huacai Chen Sept. 19, 2022, 11:15 a.m. UTC | #10
On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Ard,
> > > >
> > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > >  Hi, Ard,
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > Hi, Ard,
> > > > > > > >
> > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > The old way (so-called old world):
> > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > >
> > > > > > > > The current way (so-called new world):
> > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > >
> > > > > > > > The new way (in this patchset):
> > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > >
> > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > >
> > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > values to the kernel without going through the stub.
> > > > > > In the last patch I see:
> > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > +
> > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > +               early_init_fdt_reserve_self();
> > > > > > +
> > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > system, but similar.
> > > > > >
> > > > >
> > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > >
> > > > > Do you intend to support non-EFI DT boot by the way?
> > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > which looks similar to the old-world).
> > >
> > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > deals with that.
> > >
> > > > But I have another question: is
> > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > way DT is the earliest thing)?
> > > >
> > >
> > > If you rely on DT only for the hardware description, then loading it
> > > from efi_init() should be fine.
> > OK, then please drop patch #12 at this time. It can be added when we
> > add Loongson-2K support.
> >
>
> OK
>
> > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > that we cannot make use of the runtime services that way. So in my
> > > > > code, efi_novamap is set to false unconditionally.
> > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > supported "noefi" to disable runtime, I don't want to hack
> > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > >
> > >
> > > Fair enough. Do you have any uses for efi_novamap in mind?
> > I usually use "efi=novamap" in EFI shell to see whether it can work
> > well without runtime. And I think I will modify this patch these days
> > because in another thread you said that this series cannot boot.
>
> It works fine now.
>
> However, clearing the EFI_BOOT flag is not the correct way to disable
> runtime services only. And note that we also have efi=noruntime - does
> that currently work on LoongArch?
OK, but we don't need to add a new "efi=noruntime" parameter, I can
use "noefi" instead. But I think support "efi=novamap" is not a bad
idea (maybe I misunderstood its usage).

Huacai
Ard Biesheuvel Sept. 19, 2022, 11:21 a.m. UTC | #11
On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Ard,
> > > > >
> > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > >  Hi, Ard,
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Ard,
> > > > > > > > >
> > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > The old way (so-called old world):
> > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > >
> > > > > > > > > The current way (so-called new world):
> > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > >
> > > > > > > > > The new way (in this patchset):
> > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > >
> > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > >
> > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > values to the kernel without going through the stub.
> > > > > > > In the last patch I see:
> > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > +
> > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > +
> > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > system, but similar.
> > > > > > >
> > > > > >
> > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > >
> > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > which looks similar to the old-world).
> > > >
> > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > deals with that.
> > > >
> > > > > But I have another question: is
> > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > way DT is the earliest thing)?
> > > > >
> > > >
> > > > If you rely on DT only for the hardware description, then loading it
> > > > from efi_init() should be fine.
> > > OK, then please drop patch #12 at this time. It can be added when we
> > > add Loongson-2K support.
> > >
> >
> > OK
> >
> > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > code, efi_novamap is set to false unconditionally.
> > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > >
> > > >
> > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > well without runtime. And I think I will modify this patch these days
> > > because in another thread you said that this series cannot boot.
> >
> > It works fine now.
> >
> > However, clearing the EFI_BOOT flag is not the correct way to disable
> > runtime services only. And note that we also have efi=noruntime - does
> > that currently work on LoongArch?
> OK, but we don't need to add a new "efi=noruntime" parameter, I can
> use "noefi" instead.

OK

> But I think support "efi=novamap" is not a bad
> idea (maybe I misunderstood its usage).
>

It basically exists to deal with broken EFI firmware on ARM laptops
that were only intended to run Windows. Windows calls
SetVirtualAddressMap() *after* creating the new mappings, and some
broken firmware drivers will access those regions too early.

On Linux, we install the mapping first, and then much later, we
actually create the mappings and only activate them on a single CPU
while the EFI runtime service call is in progress.

In any case, I will reinstate the efi_novamap logic for now - we can
always revisit it later.
Huacai Chen Sept. 19, 2022, 11:57 a.m. UTC | #12
On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Ard,
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > >  Hi, Ard,
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Ard,
> > > > > > > > > >
> > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > >
> > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > >
> > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > >
> > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > >
> > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > In the last patch I see:
> > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > +
> > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > +
> > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > system, but similar.
> > > > > > > >
> > > > > > >
> > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > >
> > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > which looks similar to the old-world).
> > > > >
> > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > deals with that.
> > > > >
> > > > > > But I have another question: is
> > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > way DT is the earliest thing)?
> > > > > >
> > > > >
> > > > > If you rely on DT only for the hardware description, then loading it
> > > > > from efi_init() should be fine.
> > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > add Loongson-2K support.
> > > >
> > >
> > > OK
> > >
> > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > >
> > > > >
> > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > well without runtime. And I think I will modify this patch these days
> > > > because in another thread you said that this series cannot boot.
> > >
> > > It works fine now.
> > >
> > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > runtime services only. And note that we also have efi=noruntime - does
> > > that currently work on LoongArch?
> > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > use "noefi" instead.
>
> OK
>
> > But I think support "efi=novamap" is not a bad
> > idea (maybe I misunderstood its usage).
> >
>
> It basically exists to deal with broken EFI firmware on ARM laptops
> that were only intended to run Windows. Windows calls
> SetVirtualAddressMap() *after* creating the new mappings, and some
> broken firmware drivers will access those regions too early.
>
> On Linux, we install the mapping first, and then much later, we
> actually create the mappings and only activate them on a single CPU
> while the EFI runtime service call is in progress.
>
> In any case, I will reinstate the efi_novamap logic for now - we can
> always revisit it later.
OK, now I think there are no big problems. Only some bikesheddings:
1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
boot information, it looks most similar to the old-world, and we can
distinguish old-world/new-world by judging whether a0 is greater than
1.
2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
EFI_INVALID_TABLE_ADDR) then return immediately, this makes
indentation look better.
3, Declare "cmdline" out of the if() condition in init_environ() looks better.
4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
5, It seems that one line is enough for the last statement in efi_boot_kernel().

Hope this series will be stable as soon as possible, our kexec/kdump
work needs to adjust because of this change. :)

Huacai

>
Ard Biesheuvel Sept. 19, 2022, 12:10 p.m. UTC | #13
On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Ard,
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > >  Hi, Ard,
> > > > > > > > >
> > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi, Ard,
> > > > > > > > > > >
> > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > >
> > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > >
> > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > >
> > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > >
> > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > In the last patch I see:
> > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > +
> > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > +
> > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > system, but similar.
> > > > > > > > >
> > > > > > > >
> > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > >
> > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > which looks similar to the old-world).
> > > > > >
> > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > deals with that.
> > > > > >
> > > > > > > But I have another question: is
> > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > way DT is the earliest thing)?
> > > > > > >
> > > > > >
> > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > from efi_init() should be fine.
> > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > add Loongson-2K support.
> > > > >
> > > >
> > > > OK
> > > >
> > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > >
> > > > > >
> > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > well without runtime. And I think I will modify this patch these days
> > > > > because in another thread you said that this series cannot boot.
> > > >
> > > > It works fine now.
> > > >
> > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > runtime services only. And note that we also have efi=noruntime - does
> > > > that currently work on LoongArch?
> > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > use "noefi" instead.
> >
> > OK
> >
> > > But I think support "efi=novamap" is not a bad
> > > idea (maybe I misunderstood its usage).
> > >
> >
> > It basically exists to deal with broken EFI firmware on ARM laptops
> > that were only intended to run Windows. Windows calls
> > SetVirtualAddressMap() *after* creating the new mappings, and some
> > broken firmware drivers will access those regions too early.
> >
> > On Linux, we install the mapping first, and then much later, we
> > actually create the mappings and only activate them on a single CPU
> > while the EFI runtime service call is in progress.
> >
> > In any case, I will reinstate the efi_novamap logic for now - we can
> > always revisit it later.
> OK, now I think there are no big problems. Only some bikesheddings:
> 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> boot information, it looks most similar to the old-world, and we can
> distinguish old-world/new-world by judging whether a0 is greater than
> 1.
> 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> indentation look better.
> 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> 5, It seems that one line is enough for the last statement in efi_boot_kernel().
>

OK

> Hope this series will be stable as soon as possible, our kexec/kdump
> work needs to adjust because of this change. :)
>

Do you have a link to those patches?
Huacai Chen Sept. 19, 2022, 12:14 p.m. UTC | #14
On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > Hi, Ard,
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > >  Hi, Ard,
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > >
> > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > >
> > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > >
> > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > >
> > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > >
> > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > In the last patch I see:
> > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > +
> > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > +
> > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > system, but similar.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > >
> > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > which looks similar to the old-world).
> > > > > > >
> > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > deals with that.
> > > > > > >
> > > > > > > > But I have another question: is
> > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > way DT is the earliest thing)?
> > > > > > > >
> > > > > > >
> > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > from efi_init() should be fine.
> > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > add Loongson-2K support.
> > > > > >
> > > > >
> > > > > OK
> > > > >
> > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > >
> > > > > > >
> > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > because in another thread you said that this series cannot boot.
> > > > >
> > > > > It works fine now.
> > > > >
> > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > that currently work on LoongArch?
> > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > use "noefi" instead.
> > >
> > > OK
> > >
> > > > But I think support "efi=novamap" is not a bad
> > > > idea (maybe I misunderstood its usage).
> > > >
> > >
> > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > that were only intended to run Windows. Windows calls
> > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > broken firmware drivers will access those regions too early.
> > >
> > > On Linux, we install the mapping first, and then much later, we
> > > actually create the mappings and only activate them on a single CPU
> > > while the EFI runtime service call is in progress.
> > >
> > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > always revisit it later.
> > OK, now I think there are no big problems. Only some bikesheddings:
> > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > boot information, it looks most similar to the old-world, and we can
> > distinguish old-world/new-world by judging whether a0 is greater than
> > 1.
> > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > indentation look better.
> > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> >
>
> OK
>
> > Hope this series will be stable as soon as possible, our kexec/kdump
> > work needs to adjust because of this change. :)
> >
>
> Do you have a link to those patches?
There is a snapshot for patches pending for 6.1:
https://github.com/loongson/linux/commits/loongarch-next
Ard Biesheuvel Sept. 19, 2022, 12:27 p.m. UTC | #15
On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Ard,
> > > > > > > > >
> > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > >
> > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > >
> > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > >
> > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > >
> > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > +
> > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > +
> > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > system, but similar.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > >
> > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > which looks similar to the old-world).
> > > > > > > >
> > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > deals with that.
> > > > > > > >
> > > > > > > > > But I have another question: is
> > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > way DT is the earliest thing)?
> > > > > > > > >
> > > > > > > >
> > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > from efi_init() should be fine.
> > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > add Loongson-2K support.
> > > > > > >
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > because in another thread you said that this series cannot boot.
> > > > > >
> > > > > > It works fine now.
> > > > > >
> > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > that currently work on LoongArch?
> > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > use "noefi" instead.
> > > >
> > > > OK
> > > >
> > > > > But I think support "efi=novamap" is not a bad
> > > > > idea (maybe I misunderstood its usage).
> > > > >
> > > >
> > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > that were only intended to run Windows. Windows calls
> > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > broken firmware drivers will access those regions too early.
> > > >
> > > > On Linux, we install the mapping first, and then much later, we
> > > > actually create the mappings and only activate them on a single CPU
> > > > while the EFI runtime service call is in progress.
> > > >
> > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > always revisit it later.
> > > OK, now I think there are no big problems. Only some bikesheddings:
> > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > boot information, it looks most similar to the old-world, and we can
> > > distinguish old-world/new-world by judging whether a0 is greater than
> > > 1.
> > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > indentation look better.
> > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > >
> >
> > OK
> >
> > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > work needs to adjust because of this change. :)
> > >
> >
> > Do you have a link to those patches?
> There is a snapshot for patches pending for 6.1:
> https://github.com/loongson/linux/commits/loongarch-next

Thanks. I already spotted an issue there which is exactly the kind of
thing I am trying to avoid with this series.

diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
index 7423361b0ebc9b..c68c88e40cc0b9 100644
--- a/arch/loongarch/kernel/mem.c
+++ b/arch/loongarch/kernel/mem.c
@@ -61,4 +62,9 @@ void __init memblock_init(void)

  /* Reserve the initrd */
  reserve_initrd_mem();
+
+ /* Main reserved memory for the elf core head */
+ early_init_fdt_scan_reserved_mem();
+ /* Parse linux,usable-memory-range for crash dump kernel */
+ early_init_dt_check_for_usable_mem_range();
 }

So here, we are adding support for DT memory reservations, which kdump
apparently needs.

However, this parsing of the DT not only happens on kexec boot: all
ACPI and DT kernels will now honour FDT memory reservations passed in
via a DT when booting the first kernel, and external projects may use
this and start to depend on this.

Once something like this hits the upstream kernel, it is *very*
difficult to change or remove.

If you only need to pass the usable memory range for kcrash/kdump,
it's probably better to use a command line option for that, instead of
relying on DT memory reservations.
Huacai Chen Sept. 19, 2022, 2:25 p.m. UTC | #16
Hi, Ard,

On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Ard,
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > >
> > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > +
> > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > +
> > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > system, but similar.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > >
> > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > which looks similar to the old-world).
> > > > > > > > >
> > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > deals with that.
> > > > > > > > >
> > > > > > > > > > But I have another question: is
> > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > from efi_init() should be fine.
> > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > add Loongson-2K support.
> > > > > > > >
> > > > > > >
> > > > > > > OK
> > > > > > >
> > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > >
> > > > > > > It works fine now.
> > > > > > >
> > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > that currently work on LoongArch?
> > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > use "noefi" instead.
> > > > >
> > > > > OK
> > > > >
> > > > > > But I think support "efi=novamap" is not a bad
> > > > > > idea (maybe I misunderstood its usage).
> > > > > >
> > > > >
> > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > that were only intended to run Windows. Windows calls
> > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > broken firmware drivers will access those regions too early.
> > > > >
> > > > > On Linux, we install the mapping first, and then much later, we
> > > > > actually create the mappings and only activate them on a single CPU
> > > > > while the EFI runtime service call is in progress.
> > > > >
> > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > always revisit it later.
> > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > boot information, it looks most similar to the old-world, and we can
> > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > 1.
> > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > indentation look better.
> > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > >
> > >
> > > OK
> > >
> > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > work needs to adjust because of this change. :)
> > > >
> > >
> > > Do you have a link to those patches?
> > There is a snapshot for patches pending for 6.1:
> > https://github.com/loongson/linux/commits/loongarch-next
>
> Thanks. I already spotted an issue there which is exactly the kind of
> thing I am trying to avoid with this series.
>
> diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> index 7423361b0ebc9b..c68c88e40cc0b9 100644
> --- a/arch/loongarch/kernel/mem.c
> +++ b/arch/loongarch/kernel/mem.c
> @@ -61,4 +62,9 @@ void __init memblock_init(void)
>
>   /* Reserve the initrd */
>   reserve_initrd_mem();
> +
> + /* Main reserved memory for the elf core head */
> + early_init_fdt_scan_reserved_mem();
> + /* Parse linux,usable-memory-range for crash dump kernel */
> + early_init_dt_check_for_usable_mem_range();
>  }
>
> So here, we are adding support for DT memory reservations, which kdump
> apparently needs.
>
> However, this parsing of the DT not only happens on kexec boot: all
> ACPI and DT kernels will now honour FDT memory reservations passed in
> via a DT when booting the first kernel, and external projects may use
> this and start to depend on this.
>
> Once something like this hits the upstream kernel, it is *very*
> difficult to change or remove.
>
> If you only need to pass the usable memory range for kcrash/kdump,
> it's probably better to use a command line option for that, instead of
> relying on DT memory reservations.
Thank you for your suggestion, but we found some trouble when handle
initrd in kexec.
In current implementation, we generate a fdt in kexec-tools, then fill
"linux,usable-memory-range", "linux,elfcorehdr" and initrd information
in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
cmdline, but how to handle initrd information which is stored in a
system table?

Huacai
Ard Biesheuvel Sept. 19, 2022, 2:32 p.m. UTC | #17
On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Ard,
>
> On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi, Ard,
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > +
> > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > +
> > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > >
> > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > >
> > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > deals with that.
> > > > > > > > > >
> > > > > > > > > > > But I have another question: is
> > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > add Loongson-2K support.
> > > > > > > > >
> > > > > > > >
> > > > > > > > OK
> > > > > > > >
> > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > >
> > > > > > > > It works fine now.
> > > > > > > >
> > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > that currently work on LoongArch?
> > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > use "noefi" instead.
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > idea (maybe I misunderstood its usage).
> > > > > > >
> > > > > >
> > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > that were only intended to run Windows. Windows calls
> > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > broken firmware drivers will access those regions too early.
> > > > > >
> > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > while the EFI runtime service call is in progress.
> > > > > >
> > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > always revisit it later.
> > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > boot information, it looks most similar to the old-world, and we can
> > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > 1.
> > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > indentation look better.
> > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > >
> > > >
> > > > OK
> > > >
> > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > work needs to adjust because of this change. :)
> > > > >
> > > >
> > > > Do you have a link to those patches?
> > > There is a snapshot for patches pending for 6.1:
> > > https://github.com/loongson/linux/commits/loongarch-next
> >
> > Thanks. I already spotted an issue there which is exactly the kind of
> > thing I am trying to avoid with this series.
> >
> > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > --- a/arch/loongarch/kernel/mem.c
> > +++ b/arch/loongarch/kernel/mem.c
> > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> >
> >   /* Reserve the initrd */
> >   reserve_initrd_mem();
> > +
> > + /* Main reserved memory for the elf core head */
> > + early_init_fdt_scan_reserved_mem();
> > + /* Parse linux,usable-memory-range for crash dump kernel */
> > + early_init_dt_check_for_usable_mem_range();
> >  }
> >
> > So here, we are adding support for DT memory reservations, which kdump
> > apparently needs.
> >
> > However, this parsing of the DT not only happens on kexec boot: all
> > ACPI and DT kernels will now honour FDT memory reservations passed in
> > via a DT when booting the first kernel, and external projects may use
> > this and start to depend on this.
> >
> > Once something like this hits the upstream kernel, it is *very*
> > difficult to change or remove.
> >
> > If you only need to pass the usable memory range for kcrash/kdump,
> > it's probably better to use a command line option for that, instead of
> > relying on DT memory reservations.
> Thank you for your suggestion, but we found some trouble when handle
> initrd in kexec.
> In current implementation, we generate a fdt in kexec-tools, then fill
> "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> cmdline, but how to handle initrd information which is stored in a
> system table?
>

There are two options:
- use initrdmem= on the command line
- update the INITRD config table in memory (i.e., update the base and
size to refer to the new initrd image)
Huacai Chen Sept. 19, 2022, 2:43 p.m. UTC | #18
On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Ard,
> >
> > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > > >
> > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > > deals with that.
> > > > > > > > > > >
> > > > > > > > > > > > But I have another question: is
> > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > > add Loongson-2K support.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > OK
> > > > > > > > >
> > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > > >
> > > > > > > > > It works fine now.
> > > > > > > > >
> > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > > that currently work on LoongArch?
> > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > > use "noefi" instead.
> > > > > > >
> > > > > > > OK
> > > > > > >
> > > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > > idea (maybe I misunderstood its usage).
> > > > > > > >
> > > > > > >
> > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > > that were only intended to run Windows. Windows calls
> > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > > broken firmware drivers will access those regions too early.
> > > > > > >
> > > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > > while the EFI runtime service call is in progress.
> > > > > > >
> > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > > always revisit it later.
> > > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > > boot information, it looks most similar to the old-world, and we can
> > > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > > 1.
> > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > > indentation look better.
> > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > > >
> > > > >
> > > > > OK
> > > > >
> > > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > > work needs to adjust because of this change. :)
> > > > > >
> > > > >
> > > > > Do you have a link to those patches?
> > > > There is a snapshot for patches pending for 6.1:
> > > > https://github.com/loongson/linux/commits/loongarch-next
> > >
> > > Thanks. I already spotted an issue there which is exactly the kind of
> > > thing I am trying to avoid with this series.
> > >
> > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > > --- a/arch/loongarch/kernel/mem.c
> > > +++ b/arch/loongarch/kernel/mem.c
> > > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> > >
> > >   /* Reserve the initrd */
> > >   reserve_initrd_mem();
> > > +
> > > + /* Main reserved memory for the elf core head */
> > > + early_init_fdt_scan_reserved_mem();
> > > + /* Parse linux,usable-memory-range for crash dump kernel */
> > > + early_init_dt_check_for_usable_mem_range();
> > >  }
> > >
> > > So here, we are adding support for DT memory reservations, which kdump
> > > apparently needs.
> > >
> > > However, this parsing of the DT not only happens on kexec boot: all
> > > ACPI and DT kernels will now honour FDT memory reservations passed in
> > > via a DT when booting the first kernel, and external projects may use
> > > this and start to depend on this.
> > >
> > > Once something like this hits the upstream kernel, it is *very*
> > > difficult to change or remove.
> > >
> > > If you only need to pass the usable memory range for kcrash/kdump,
> > > it's probably better to use a command line option for that, instead of
> > > relying on DT memory reservations.
> > Thank you for your suggestion, but we found some trouble when handle
> > initrd in kexec.
> > In current implementation, we generate a fdt in kexec-tools, then fill
> > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> > cmdline, but how to handle initrd information which is stored in a
> > system table?
> >
>
> There are two options:
> - use initrdmem= on the command line
This is not a good way, even if the second kernel handle "initrdmem=",
it will conflict with the config table.

> - update the INITRD config table in memory (i.e., update the base and
> size to refer to the new initrd image)
This way seems practicable, but we don't know how to handle it in
kexec-tools. :(

Huacai
Ard Biesheuvel Sept. 19, 2022, 2:44 p.m. UTC | #19
On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Ard,
> > >
> > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > > > >
> > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > > > deals with that.
> > > > > > > > > > > >
> > > > > > > > > > > > > But I have another question: is
> > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > > > add Loongson-2K support.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > OK
> > > > > > > > > >
> > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > > > >
> > > > > > > > > > It works fine now.
> > > > > > > > > >
> > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > > > that currently work on LoongArch?
> > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > > > use "noefi" instead.
> > > > > > > >
> > > > > > > > OK
> > > > > > > >
> > > > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > > > idea (maybe I misunderstood its usage).
> > > > > > > > >
> > > > > > > >
> > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > > > that were only intended to run Windows. Windows calls
> > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > > > broken firmware drivers will access those regions too early.
> > > > > > > >
> > > > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > > > while the EFI runtime service call is in progress.
> > > > > > > >
> > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > > > always revisit it later.
> > > > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > > > boot information, it looks most similar to the old-world, and we can
> > > > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > > > 1.
> > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > > > indentation look better.
> > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > > > >
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > > > work needs to adjust because of this change. :)
> > > > > > >
> > > > > >
> > > > > > Do you have a link to those patches?
> > > > > There is a snapshot for patches pending for 6.1:
> > > > > https://github.com/loongson/linux/commits/loongarch-next
> > > >
> > > > Thanks. I already spotted an issue there which is exactly the kind of
> > > > thing I am trying to avoid with this series.
> > > >
> > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > > > --- a/arch/loongarch/kernel/mem.c
> > > > +++ b/arch/loongarch/kernel/mem.c
> > > > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> > > >
> > > >   /* Reserve the initrd */
> > > >   reserve_initrd_mem();
> > > > +
> > > > + /* Main reserved memory for the elf core head */
> > > > + early_init_fdt_scan_reserved_mem();
> > > > + /* Parse linux,usable-memory-range for crash dump kernel */
> > > > + early_init_dt_check_for_usable_mem_range();
> > > >  }
> > > >
> > > > So here, we are adding support for DT memory reservations, which kdump
> > > > apparently needs.
> > > >
> > > > However, this parsing of the DT not only happens on kexec boot: all
> > > > ACPI and DT kernels will now honour FDT memory reservations passed in
> > > > via a DT when booting the first kernel, and external projects may use
> > > > this and start to depend on this.
> > > >
> > > > Once something like this hits the upstream kernel, it is *very*
> > > > difficult to change or remove.
> > > >
> > > > If you only need to pass the usable memory range for kcrash/kdump,
> > > > it's probably better to use a command line option for that, instead of
> > > > relying on DT memory reservations.
> > > Thank you for your suggestion, but we found some trouble when handle
> > > initrd in kexec.
> > > In current implementation, we generate a fdt in kexec-tools, then fill
> > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> > > cmdline, but how to handle initrd information which is stored in a
> > > system table?
> > >
> >
> > There are two options:
> > - use initrdmem= on the command line
> This is not a good way, even if the second kernel handle "initrdmem=",
> it will conflict with the config table.
>

It will not conflict - initrdmem= supersedes the INITRD table because
early param passing happens after efi_init().

> > - update the INITRD config table in memory (i.e., update the base and
> > size to refer to the new initrd image)
> This way seems practicable, but we don't know how to handle it in
> kexec-tools. :(
>

Good point. Let me think a bit about this.
Huacai Chen Sept. 19, 2022, 3:08 p.m. UTC | #20
On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Ard,
> > > >
> > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > > > > deals with that.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > But I have another question: is
> > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > > > > add Loongson-2K support.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > OK
> > > > > > > > > > >
> > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > > > > >
> > > > > > > > > > > It works fine now.
> > > > > > > > > > >
> > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > > > > that currently work on LoongArch?
> > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > > > > use "noefi" instead.
> > > > > > > > >
> > > > > > > > > OK
> > > > > > > > >
> > > > > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > > > > idea (maybe I misunderstood its usage).
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > > > > that were only intended to run Windows. Windows calls
> > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > > > > broken firmware drivers will access those regions too early.
> > > > > > > > >
> > > > > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > > > > while the EFI runtime service call is in progress.
> > > > > > > > >
> > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > > > > always revisit it later.
> > > > > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > > > > boot information, it looks most similar to the old-world, and we can
> > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > > > > 1.
> > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > > > > indentation look better.
> > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > > > > >
> > > > > > >
> > > > > > > OK
> > > > > > >
> > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > > > > work needs to adjust because of this change. :)
> > > > > > > >
> > > > > > >
> > > > > > > Do you have a link to those patches?
> > > > > > There is a snapshot for patches pending for 6.1:
> > > > > > https://github.com/loongson/linux/commits/loongarch-next
> > > > >
> > > > > Thanks. I already spotted an issue there which is exactly the kind of
> > > > > thing I am trying to avoid with this series.
> > > > >
> > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > > > > --- a/arch/loongarch/kernel/mem.c
> > > > > +++ b/arch/loongarch/kernel/mem.c
> > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> > > > >
> > > > >   /* Reserve the initrd */
> > > > >   reserve_initrd_mem();
> > > > > +
> > > > > + /* Main reserved memory for the elf core head */
> > > > > + early_init_fdt_scan_reserved_mem();
> > > > > + /* Parse linux,usable-memory-range for crash dump kernel */
> > > > > + early_init_dt_check_for_usable_mem_range();
> > > > >  }
> > > > >
> > > > > So here, we are adding support for DT memory reservations, which kdump
> > > > > apparently needs.
> > > > >
> > > > > However, this parsing of the DT not only happens on kexec boot: all
> > > > > ACPI and DT kernels will now honour FDT memory reservations passed in
> > > > > via a DT when booting the first kernel, and external projects may use
> > > > > this and start to depend on this.
> > > > >
> > > > > Once something like this hits the upstream kernel, it is *very*
> > > > > difficult to change or remove.
> > > > >
> > > > > If you only need to pass the usable memory range for kcrash/kdump,
> > > > > it's probably better to use a command line option for that, instead of
> > > > > relying on DT memory reservations.
> > > > Thank you for your suggestion, but we found some trouble when handle
> > > > initrd in kexec.
> > > > In current implementation, we generate a fdt in kexec-tools, then fill
> > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> > > > cmdline, but how to handle initrd information which is stored in a
> > > > system table?
> > > >
> > >
> > > There are two options:
> > > - use initrdmem= on the command line
> > This is not a good way, even if the second kernel handle "initrdmem=",
> > it will conflict with the config table.
> >
>
> It will not conflict - initrdmem= supersedes the INITRD table because
> early param passing happens after efi_init().
>
> > > - update the INITRD config table in memory (i.e., update the base and
> > > size to refer to the new initrd image)
> > This way seems practicable, but we don't know how to handle it in
> > kexec-tools. :(
> >
>
> Good point. Let me think a bit about this.
OK, then let's go back to this series itself. :)

I have seen the latest code in your git repo, I don't think we need
efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as
before seems reasonable.

If efi_novamap breaks something, I can accept "set efi_novamap to
false" in the previous version, or just ignore its value in the
efistub, but a0 should clearly be the indicator of EFI_BOOT.

Huacai
Ard Biesheuvel Sept. 19, 2022, 3:50 p.m. UTC | #21
On Mon, 19 Sept 2022 at 17:09, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Ard,
> > > > >
> > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > > > > > deals with that.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > But I have another question: is
> > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > > > > > add Loongson-2K support.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > OK
> > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > > > > > >
> > > > > > > > > > > > It works fine now.
> > > > > > > > > > > >
> > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > > > > > that currently work on LoongArch?
> > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > > > > > use "noefi" instead.
> > > > > > > > > >
> > > > > > > > > > OK
> > > > > > > > > >
> > > > > > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > > > > > idea (maybe I misunderstood its usage).
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > > > > > that were only intended to run Windows. Windows calls
> > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > > > > > broken firmware drivers will access those regions too early.
> > > > > > > > > >
> > > > > > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > > > > > while the EFI runtime service call is in progress.
> > > > > > > > > >
> > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > > > > > always revisit it later.
> > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > > > > > boot information, it looks most similar to the old-world, and we can
> > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > > > > > 1.
> > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > > > > > indentation look better.
> > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > > > > > >
> > > > > > > >
> > > > > > > > OK
> > > > > > > >
> > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > > > > > work needs to adjust because of this change. :)
> > > > > > > > >
> > > > > > > >
> > > > > > > > Do you have a link to those patches?
> > > > > > > There is a snapshot for patches pending for 6.1:
> > > > > > > https://github.com/loongson/linux/commits/loongarch-next
> > > > > >
> > > > > > Thanks. I already spotted an issue there which is exactly the kind of
> > > > > > thing I am trying to avoid with this series.
> > > > > >
> > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > > > > > --- a/arch/loongarch/kernel/mem.c
> > > > > > +++ b/arch/loongarch/kernel/mem.c
> > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> > > > > >
> > > > > >   /* Reserve the initrd */
> > > > > >   reserve_initrd_mem();
> > > > > > +
> > > > > > + /* Main reserved memory for the elf core head */
> > > > > > + early_init_fdt_scan_reserved_mem();
> > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */
> > > > > > + early_init_dt_check_for_usable_mem_range();
> > > > > >  }
> > > > > >
> > > > > > So here, we are adding support for DT memory reservations, which kdump
> > > > > > apparently needs.
> > > > > >
> > > > > > However, this parsing of the DT not only happens on kexec boot: all
> > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in
> > > > > > via a DT when booting the first kernel, and external projects may use
> > > > > > this and start to depend on this.
> > > > > >
> > > > > > Once something like this hits the upstream kernel, it is *very*
> > > > > > difficult to change or remove.
> > > > > >
> > > > > > If you only need to pass the usable memory range for kcrash/kdump,
> > > > > > it's probably better to use a command line option for that, instead of
> > > > > > relying on DT memory reservations.
> > > > > Thank you for your suggestion, but we found some trouble when handle
> > > > > initrd in kexec.
> > > > > In current implementation, we generate a fdt in kexec-tools, then fill
> > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> > > > > cmdline, but how to handle initrd information which is stored in a
> > > > > system table?
> > > > >
> > > >
> > > > There are two options:
> > > > - use initrdmem= on the command line
> > > This is not a good way, even if the second kernel handle "initrdmem=",
> > > it will conflict with the config table.
> > >
> >
> > It will not conflict - initrdmem= supersedes the INITRD table because
> > early param passing happens after efi_init().
> >
> > > > - update the INITRD config table in memory (i.e., update the base and
> > > > size to refer to the new initrd image)
> > > This way seems practicable, but we don't know how to handle it in
> > > kexec-tools. :(
> > >
> >
> > Good point. Let me think a bit about this.
> OK, then let's go back to this series itself. :)
>
> I have seen the latest code in your git repo, I don't think we need
> efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as
> before seems reasonable.
>

No, not really. EFI_BOOT basically means 'did I boot via EFI?' and
strictly, you should not be attempting to access the EFI system table
or parse EFI config tables unless EFI_BOOT is true.

Deviating from this makes it more difficult for generic code to reason
about what parts of EFI are active on a given system.

> If efi_novamap breaks something, I can accept "set efi_novamap to
> false" in the previous version, or just ignore its value in the
> efistub, but a0 should clearly be the indicator of EFI_BOOT.
>

I'm fine with keeping efi_novamap if you think it has a value. To me,
it seems rather pointless, because it prevents you from using the
runtime services.

If you want a0 to control the EFI boot flag, you should find another
way to control whether runtime services can be used, because they are
really two different things.
Huacai Chen Sept. 20, 2022, 1:44 a.m. UTC | #22
Hi, Ard,

On Mon, Sep 19, 2022 at 11:50 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 19 Sept 2022 at 17:09, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Ard,
> > > > > >
> > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >  Hi, Ard,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hi, Ard,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed.
> > > > > > > > > > > > > > > > > > > > The old way (so-called old world):
> > > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > The current way (so-called new world):
> > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > The new way (in this patchset):
> > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons:
> > > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method;
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and
> > > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the
> > > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those
> > > > > > > > > > > > > > > > > > > values to the kernel without going through the stub.
> > > > > > > > > > > > > > > > > > In the last patch I see:
> > > > > > > > > > > > > > > > > > +               void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
> > > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > > +               early_init_dt_scan(fdt_ptr);
> > > > > > > > > > > > > > > > > > +               early_init_fdt_reserve_self();
> > > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > >                 clear_bit(EFI_BOOT, &efi.flags);
> > > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt
> > > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI
> > > > > > > > > > > > > > > > > > system, but similar.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so
> > > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way?
> > > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag,
> > > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2,
> > > > > > > > > > > > > > > > which looks similar to the old-world).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that
> > > > > > > > > > > > > > > deals with that.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > But I have another question: is
> > > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current
> > > > > > > > > > > > > > > > way DT is the earliest thing)?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it
> > > > > > > > > > > > > > > from efi_init() should be fine.
> > > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we
> > > > > > > > > > > > > > add Loongson-2K support.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > OK
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given
> > > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my
> > > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally.
> > > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already
> > > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack
> > > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind?
> > > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work
> > > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days
> > > > > > > > > > > > > > because in another thread you said that this series cannot boot.
> > > > > > > > > > > > >
> > > > > > > > > > > > > It works fine now.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable
> > > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does
> > > > > > > > > > > > > that currently work on LoongArch?
> > > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can
> > > > > > > > > > > > use "noefi" instead.
> > > > > > > > > > >
> > > > > > > > > > > OK
> > > > > > > > > > >
> > > > > > > > > > > > But I think support "efi=novamap" is not a bad
> > > > > > > > > > > > idea (maybe I misunderstood its usage).
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops
> > > > > > > > > > > that were only intended to run Windows. Windows calls
> > > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some
> > > > > > > > > > > broken firmware drivers will access those regions too early.
> > > > > > > > > > >
> > > > > > > > > > > On Linux, we install the mapping first, and then much later, we
> > > > > > > > > > > actually create the mappings and only activate them on a single CPU
> > > > > > > > > > > while the EFI runtime service call is in progress.
> > > > > > > > > > >
> > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can
> > > > > > > > > > > always revisit it later.
> > > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings:
> > > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass
> > > > > > > > > > boot information, it looks most similar to the old-world, and we can
> > > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than
> > > > > > > > > > 1.
> > > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap ==
> > > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes
> > > > > > > > > > indentation look better.
> > > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better.
> > > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel().
> > > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel().
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > OK
> > > > > > > > >
> > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump
> > > > > > > > > > work needs to adjust because of this change. :)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Do you have a link to those patches?
> > > > > > > > There is a snapshot for patches pending for 6.1:
> > > > > > > > https://github.com/loongson/linux/commits/loongarch-next
> > > > > > >
> > > > > > > Thanks. I already spotted an issue there which is exactly the kind of
> > > > > > > thing I am trying to avoid with this series.
> > > > > > >
> > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
> > > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644
> > > > > > > --- a/arch/loongarch/kernel/mem.c
> > > > > > > +++ b/arch/loongarch/kernel/mem.c
> > > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void)
> > > > > > >
> > > > > > >   /* Reserve the initrd */
> > > > > > >   reserve_initrd_mem();
> > > > > > > +
> > > > > > > + /* Main reserved memory for the elf core head */
> > > > > > > + early_init_fdt_scan_reserved_mem();
> > > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */
> > > > > > > + early_init_dt_check_for_usable_mem_range();
> > > > > > >  }
> > > > > > >
> > > > > > > So here, we are adding support for DT memory reservations, which kdump
> > > > > > > apparently needs.
> > > > > > >
> > > > > > > However, this parsing of the DT not only happens on kexec boot: all
> > > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in
> > > > > > > via a DT when booting the first kernel, and external projects may use
> > > > > > > this and start to depend on this.
> > > > > > >
> > > > > > > Once something like this hits the upstream kernel, it is *very*
> > > > > > > difficult to change or remove.
> > > > > > >
> > > > > > > If you only need to pass the usable memory range for kcrash/kdump,
> > > > > > > it's probably better to use a command line option for that, instead of
> > > > > > > relying on DT memory reservations.
> > > > > > Thank you for your suggestion, but we found some trouble when handle
> > > > > > initrd in kexec.
> > > > > > In current implementation, we generate a fdt in kexec-tools, then fill
> > > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information
> > > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in
> > > > > > cmdline, but how to handle initrd information which is stored in a
> > > > > > system table?
> > > > > >
> > > > >
> > > > > There are two options:
> > > > > - use initrdmem= on the command line
> > > > This is not a good way, even if the second kernel handle "initrdmem=",
> > > > it will conflict with the config table.
> > > >
> > >
> > > It will not conflict - initrdmem= supersedes the INITRD table because
> > > early param passing happens after efi_init().
> > >
> > > > > - update the INITRD config table in memory (i.e., update the base and
> > > > > size to refer to the new initrd image)
> > > > This way seems practicable, but we don't know how to handle it in
> > > > kexec-tools. :(
> > > >
> > >
> > > Good point. Let me think a bit about this.
> > OK, then let's go back to this series itself. :)
> >
> > I have seen the latest code in your git repo, I don't think we need
> > efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as
> > before seems reasonable.
> >
>
> No, not really. EFI_BOOT basically means 'did I boot via EFI?' and
> strictly, you should not be attempting to access the EFI system table
> or parse EFI config tables unless EFI_BOOT is true.
>
> Deviating from this makes it more difficult for generic code to reason
> about what parts of EFI are active on a given system.
>
> > If efi_novamap breaks something, I can accept "set efi_novamap to
> > false" in the previous version, or just ignore its value in the
> > efistub, but a0 should clearly be the indicator of EFI_BOOT.
> >
>
> I'm fine with keeping efi_novamap if you think it has a value. To me,
> it seems rather pointless, because it prevents you from using the
> runtime services.
>
> If you want a0 to control the EFI boot flag, you should find another
> way to control whether runtime services can be used, because they are
> really two different things.
I'm very sorry, after an offline discussion with my colleagues,
non-EFI DT boot is still needed (very sadly, we want to drop non-EFI
firmware but we can't do that). However, for non-EFI DT boot we will
use the same parameter passing method (a0=efi boot flag, a1=cmdline,
a2=systemtable), firmware will generate a simple systemtable only for
DT. In this way all boot methods share the same logic, and also make
kexec easy to implement.

So, let's make a0 the real "efi boot flag" and let it control
EFI_BOOT, for efistub, we can just pass "true" unconditionally
(whether support efi_novamap is not as important as the efi boot flag
for us, as you said, efi_novamap is just for broken firmware).

Huacai
Ard Biesheuvel Sept. 20, 2022, 8:04 a.m. UTC | #23
On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Ard,
>
...
> I'm very sorry, after an offline discussion with my colleagues,
> non-EFI DT boot is still needed (very sadly, we want to drop non-EFI
> firmware but we can't do that). However, for non-EFI DT boot we will
> use the same parameter passing method (a0=efi boot flag, a1=cmdline,
> a2=systemtable), firmware will generate a simple systemtable only for
> DT. In this way all boot methods share the same logic, and also make
> kexec easy to implement.
>

OK, that should work. So I suppose you create a EFI system table along
with EFI configuration tables for DT, SMBIOS, etc? In this case, I
suggest you omit the MEMMAP config table that I am adding here, so
that there is no ambiguity between the EFI provided memory map and the
one provided by DT.

I think that should be a clean way to implement this.

> So, let's make a0 the real "efi boot flag" and let it control
> EFI_BOOT, for efistub, we can just pass "true" unconditionally
> (whether support efi_novamap is not as important as the efi boot flag
> for us, as you said, efi_novamap is just for broken firmware).

Indeed. So as before, I will just set efi_novamap to false. You can
still use noefi or efi=noruntime to turn off the EFI runtime pieces if
needed (e.g., PREEMPT_RT tends to disable those by default to preserve
their bounded worst case latency)
Huacai Chen Sept. 20, 2022, 1:12 p.m. UTC | #24
Hi, Ard,

On Tue, Sep 20, 2022 at 4:04 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Ard,
> >
> ...
> > I'm very sorry, after an offline discussion with my colleagues,
> > non-EFI DT boot is still needed (very sadly, we want to drop non-EFI
> > firmware but we can't do that). However, for non-EFI DT boot we will
> > use the same parameter passing method (a0=efi boot flag, a1=cmdline,
> > a2=systemtable), firmware will generate a simple systemtable only for
> > DT. In this way all boot methods share the same logic, and also make
> > kexec easy to implement.
> >
>
> OK, that should work. So I suppose you create a EFI system table along
> with EFI configuration tables for DT, SMBIOS, etc? In this case, I
> suggest you omit the MEMMAP config table that I am adding here, so
> that there is no ambiguity between the EFI provided memory map and the
> one provided by DT.
>
> I think that should be a clean way to implement this.
OK, thanks.

I have merged efi-cleanups-for-v6.1-v2 to
https://github.com/loongson/linux/commits/loongarch-next. It seems
everything work well except kexec.

Huacai
>
> > So, let's make a0 the real "efi boot flag" and let it control
> > EFI_BOOT, for efistub, we can just pass "true" unconditionally
> > (whether support efi_novamap is not as important as the efi boot flag
> > for us, as you said, efi_novamap is just for broken firmware).
>
> Indeed. So as before, I will just set efi_novamap to false. You can
> still use noefi or efi=noruntime to turn off the EFI runtime pieces if
> needed (e.g., PREEMPT_RT tends to disable those by default to preserve
> their bounded worst case latency)
Ard Biesheuvel Sept. 20, 2022, 2:53 p.m. UTC | #25
On Tue, 20 Sept 2022 at 15:12, Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Ard,
>
> On Tue, Sep 20, 2022 at 4:04 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Ard,
> > >
> > ...
> > > I'm very sorry, after an offline discussion with my colleagues,
> > > non-EFI DT boot is still needed (very sadly, we want to drop non-EFI
> > > firmware but we can't do that). However, for non-EFI DT boot we will
> > > use the same parameter passing method (a0=efi boot flag, a1=cmdline,
> > > a2=systemtable), firmware will generate a simple systemtable only for
> > > DT. In this way all boot methods share the same logic, and also make
> > > kexec easy to implement.
> > >
> >
> > OK, that should work. So I suppose you create a EFI system table along
> > with EFI configuration tables for DT, SMBIOS, etc? In this case, I
> > suggest you omit the MEMMAP config table that I am adding here, so
> > that there is no ambiguity between the EFI provided memory map and the
> > one provided by DT.
> >
> > I think that should be a clean way to implement this.
> OK, thanks.
>
> I have merged efi-cleanups-for-v6.1-v2 to
> https://github.com/loongson/linux/commits/loongarch-next. It seems
> everything work well except kexec.
>

OK thanks for testing. I will send out a v2 today and merge the
changes into efi/next tomorrow.
diff mbox series

Patch

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index fca106a8b8af..14a2a1ec8561 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -104,8 +104,6 @@  config LOONGARCH
 	select MODULES_USE_ELF_RELA if MODULES
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
 	select NEED_PER_CPU_PAGE_FIRST_CHUNK
-	select OF
-	select OF_EARLY_FLATTREE
 	select PCI
 	select PCI_DOMAINS_GENERIC
 	select PCI_ECAM if ACPI
@@ -311,7 +309,6 @@  config DMI
 config EFI
 	bool "EFI runtime service support"
 	select UCS2_STRING
-	select EFI_PARAMS_FROM_FDT
 	select EFI_RUNTIME_WRAPPERS
 	help
 	  This enables the kernel to use EFI runtime services that are
diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
index e02ac4af7f6e..8e5881bc5ad1 100644
--- a/arch/loongarch/include/asm/bootinfo.h
+++ b/arch/loongarch/include/asm/bootinfo.h
@@ -36,7 +36,7 @@  struct loongson_system_configuration {
 };
 
 extern u64 efi_system_table;
-extern unsigned long fw_arg0, fw_arg1;
+extern unsigned long fw_arg0, fw_arg1, fw_arg2;
 extern struct loongson_board_info b_info;
 extern struct loongson_system_configuration loongson_sysconf;
 
diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
index 1f1f755fb425..3b80675726ec 100644
--- a/arch/loongarch/kernel/efi.c
+++ b/arch/loongarch/kernel/efi.c
@@ -27,8 +27,13 @@ 
 static unsigned long efi_nr_tables;
 static unsigned long efi_config_table;
 
+static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR;
+
 static efi_system_table_t *efi_systab;
-static efi_config_table_type_t arch_tables[] __initdata = {{},};
+static efi_config_table_type_t arch_tables[] __initdata = {
+	{LINUX_EFI_BOOT_MEMMAP_GUID,	&boot_memmap,	"MEMMAP" },
+	{},
+};
 
 void __init efi_runtime_init(void)
 {
@@ -61,6 +66,8 @@  void __init efi_init(void)
 		return;
 	}
 
+	efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor);
+
 	set_bit(EFI_64BIT, &efi.flags);
 	efi_nr_tables	 = efi_systab->nr_tables;
 	efi_config_table = (unsigned long)efi_systab->tables;
@@ -72,4 +79,25 @@  void __init efi_init(void)
 
 	if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
 		memblock_reserve(screen_info.lfb_base, screen_info.lfb_size);
+
+	if (boot_memmap != EFI_INVALID_TABLE_ADDR) {
+		struct efi_memory_map_data data;
+		struct efi_boot_memmap *tbl;
+
+		tbl = early_memremap_ro(boot_memmap, sizeof(*tbl));
+		if (tbl) {
+			data.phys_map		= boot_memmap + sizeof(*tbl);
+			data.size		= tbl->map_size;
+			data.desc_size		= tbl->desc_size;
+			data.desc_version	= tbl->desc_ver;
+
+			if (efi_memmap_init_early(&data) < 0)
+				panic("Unable to map EFI memory map.\n");
+
+			memblock_reserve(data.phys_map & PAGE_MASK,
+					 PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
+
+			early_memunmap(tbl, sizeof(*tbl));
+		}
+	}
 }
diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
index 82b478a5c665..05c38d28476e 100644
--- a/arch/loongarch/kernel/env.c
+++ b/arch/loongarch/kernel/env.c
@@ -8,7 +8,6 @@ 
 #include <linux/efi.h>
 #include <linux/export.h>
 #include <linux/memblock.h>
-#include <linux/of_fdt.h>
 #include <asm/early_ioremap.h>
 #include <asm/bootinfo.h>
 #include <asm/loongson.h>
@@ -20,21 +19,18 @@  EXPORT_SYMBOL(loongson_sysconf);
 void __init init_environ(void)
 {
 	int efi_boot = fw_arg0;
-	struct efi_memory_map_data data;
-	void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K);
 
-	if (efi_boot)
-		set_bit(EFI_BOOT, &efi.flags);
-	else
-		clear_bit(EFI_BOOT, &efi.flags);
+	if (efi_boot) {
+		char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE);
 
-	early_init_dt_scan(fdt_ptr);
-	early_init_fdt_reserve_self();
-	efi_system_table = efi_get_fdt_params(&data);
+		strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE);
+		early_memunmap(cmdline, COMMAND_LINE_SIZE);
 
-	efi_memmap_init_early(&data);
-	memblock_reserve(data.phys_map & PAGE_MASK,
-			 PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
+		efi_system_table = fw_arg1;
+		set_bit(EFI_BOOT, &efi.flags);
+	} else {
+		clear_bit(EFI_BOOT, &efi.flags);
+	}
 }
 
 static int __init init_cpu_fullname(void)
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index 01bac62a6442..8f89f39fd31b 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -67,6 +67,8 @@  SYM_CODE_START(kernel_entry)			# kernel entry point
 	st.d		a0, t0, 0		# firmware arguments
 	la		t0, fw_arg1
 	st.d		a1, t0, 0
+	la		t0, fw_arg2
+	st.d		a2, t0, 0
 
 	/* KSave3 used for percpu base, initialized as 0 */
 	csrwr		zero, PERCPU_BASE_KS
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index e8714b1d94c8..7fabf2306e80 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -51,7 +51,7 @@ 
 
 struct screen_info screen_info __section(".data");
 
-unsigned long fw_arg0, fw_arg1;
+unsigned long fw_arg0, fw_arg1, fw_arg2;
 DEFINE_PER_CPU(unsigned long, kernelsp);
 struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly;
 
@@ -187,7 +187,6 @@  early_param("mem", early_parse_mem);
 
 void __init platform_init(void)
 {
-	efi_init();
 #ifdef CONFIG_ACPI_TABLE_UPGRADE
 	acpi_table_upgrade();
 #endif
@@ -347,6 +346,7 @@  void __init setup_arch(char **cmdline_p)
 	*cmdline_p = boot_command_line;
 
 	init_environ();
+	efi_init();
 	memblock_init();
 	parse_early_param();
 
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 0dbc6d93f2e6..d8d6657e6277 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -29,7 +29,7 @@  cflags-$(CONFIG_RISCV)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 cflags-$(CONFIG_LOONGARCH)	:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   -fpie
 
-cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
+cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)	+= -I$(srctree)/scripts/dtc/libfdt
 
 KBUILD_CFLAGS			:= $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
 				   -include $(srctree)/include/linux/hidden.h \
@@ -60,14 +60,17 @@  lib-y				:= efi-stub-helper.o gop.o secureboot.o tpm.o \
 				   alignedmem.o relocate.o vsprintf.o \
 				   systable.o
 
-# include the stub's generic dependencies from lib/ when building for ARM/arm64
-efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
+# include the stub's libfdt dependencies from lib/ when needed
+libfdt-deps			:= fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \
+				   fdt_empty_tree.c fdt_sw.c
+
+lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \
+				     $(patsubst %.c,lib-%.o,$(libfdt-deps))
 
 $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
 	$(call if_changed_rule,cc_o_c)
 
-lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o fdt.o string.o intrinsics.o \
-				   $(patsubst %.c,lib-%.o,$(efi-deps-y))
+lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o string.o intrinsics.o
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
 lib-$(CONFIG_ARM64)		+= arm64-stub.o
diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
index b7ef8d2df59e..7c684d10f936 100644
--- a/drivers/firmware/efi/libstub/loongarch-stub.c
+++ b/drivers/firmware/efi/libstub/loongarch-stub.c
@@ -9,7 +9,8 @@ 
 #include <asm/addrspace.h>
 #include "efistub.h"
 
-typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt);
+typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab,
+					  unsigned long cmdline);
 
 extern int kernel_asize;
 extern int kernel_fsize;
@@ -42,19 +43,60 @@  efi_status_t handle_kernel_image(unsigned long *image_addr,
 	return status;
 }
 
-void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size)
+struct exit_boot_struct {
+	efi_memory_desc_t	*runtime_map;
+	int			runtime_entry_count;
+};
+
+static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
+{
+	struct exit_boot_struct *p = priv;
+
+	/*
+	 * Update the memory map with virtual addresses. The function will also
+	 * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
+	 * entries so that we can pass it straight to SetVirtualAddressMap()
+	 */
+	efi_get_virtmap(map->map, map->map_size, map->desc_size,
+			p->runtime_map, &p->runtime_entry_count);
+
+	return EFI_SUCCESS;
+}
+
+efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image,
+			     unsigned long image_addr, char *cmdline_ptr)
 {
 	kernel_entry_t real_kernel_entry;
+	struct exit_boot_struct priv;
+	unsigned long desc_size;
+	efi_status_t status;
+	u32 desc_ver;
+
+	status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver);
+	if (status != EFI_SUCCESS) {
+		efi_err("Unable to retrieve UEFI memory map.\n");
+		return status;
+	}
+
+	efi_info("Exiting boot services\n");
+
+	efi_novamap = false;
+	status = efi_exit_boot_services(handle, &priv, exit_boot_func);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	/* Install the new virtual address map */
+	efi_rt_call(set_virtual_address_map,
+		    priv.runtime_entry_count * desc_size, desc_size,
+		    desc_ver, priv.runtime_map);
 
 	/* Config Direct Mapping */
 	csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
 	csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
 
 	real_kernel_entry = (kernel_entry_t)
-		((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS);
+		((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS);
 
-	if (!efi_novamap)
-		real_kernel_entry(true, fdt);
-	else
-		real_kernel_entry(false, fdt);
+	real_kernel_entry(true, (unsigned long)efi_system_table,
+			  (unsigned long)cmdline_ptr);
 }