diff mbox series

[v3] Support ESRT in Xen dom0

Message ID 20220919193257.2031-1-demi@invisiblethingslab.com
State New
Headers show
Series [v3] Support ESRT in Xen dom0 | expand

Commit Message

Demi Marie Obenour Sept. 19, 2022, 7:32 p.m. UTC
fwupd requires access to the EFI System Resource Table (ESRT) to
discover which firmware can be updated by the OS.  Currently, Linux does
not expose the ESRT when running as a Xen dom0.  Therefore, it is not
possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
Qubes OS.

Before Xen 4.16, this was not fixable due to hypervisor limitations.
The UEFI specification requires the ESRT to be in EfiBootServicesData
memory, which Xen will use for whatever purposes it likes.  Therefore,
Linux cannot safely access the ESRT, as Xen may have overwritten it.

Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
the ESRT to it, and finally replaces the ESRT pointer with a pointer to
the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
this ensures that the ESRT can safely be accessed by the OS.  It is safe
to access the ESRT under Xen if, and only if, it is in memory of type
EfiRuntimeServicesData.

When running as a Xen dom0, check if the ESRT is in memory of type
EfiRuntimeServicesData, and if it is, parse it as if not running under
Xen.  This allows programs such as fwupd which require the ESRT to run
under Xen, and so makes fwupd support in Qubes OS possible.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v2:

- Massively updated commit message.
- Fetch the ESRT inline in drivers/firmware/efi/esrt.c, instead of using
  a single-use helper in drivers/xen/efi.c.

Changes since v1:

- Use a different type (struct xen_efi_mem_info) for memory information
  provided by Xen, as Xen reports it in a different way than the
  standard Linux functions do.

 drivers/firmware/efi/esrt.c | 71 ++++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 13 deletions(-)

Comments

Ard Biesheuvel Sept. 20, 2022, 3:36 p.m. UTC | #1
Hello Demi,

On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
<demi@invisiblethingslab.com> wrote:
>
> fwupd requires access to the EFI System Resource Table (ESRT) to
> discover which firmware can be updated by the OS.  Currently, Linux does
> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> Qubes OS.
>
> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> The UEFI specification requires the ESRT to be in EfiBootServicesData
> memory, which Xen will use for whatever purposes it likes.  Therefore,
> Linux cannot safely access the ESRT, as Xen may have overwritten it.
>
> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> to access the ESRT under Xen if, and only if, it is in memory of type
> EfiRuntimeServicesData.
>

Thanks for the elaborate explanation. This is really helpful.

So here, you are explaining that the only way for Xen to prevent
itself from potentially clobbering the ESRT is by creating a
completely new allocation? What about other assets that may be passed
via EFI boot services data regions?

So first of all, EfiRuntimeServicesData has a special purpose: it is
used to carry data that is part of the EFI runtime service
implementations themselves. Therefore, it has to be mapped into the
EFI page tables by the OS kernel, and carved out of the linear map (to
prevent inadvertent access with mismatched attributes). So unless you
are writing the code that backs GetVariable() or SetVariable(), there
are never good reasons to use EfiRuntimeServicesData.

If you want to use a memory type that is suitable for firmware tables
that are intended for consumption by the OS only (and not by the
runtime services themselves), you might consider EfiAcpiReclaimMemory.

TBH I still don't think this is a scalable approach. There are other
configuration tables that may be passed in EFI boot services memory,
and MS especially were pushing back in the UEFI forum on adding table
types that were passed in anything other the EfiBootServicesData.

> When running as a Xen dom0, check if the ESRT is in memory of type
> EfiRuntimeServicesData, and if it is, parse it as if not running under
> Xen.  This allows programs such as fwupd which require the ESRT to run
> under Xen, and so makes fwupd support in Qubes OS possible.
>
> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
> ---
> Changes since v2:
>
> - Massively updated commit message.
> - Fetch the ESRT inline in drivers/firmware/efi/esrt.c, instead of using
>   a single-use helper in drivers/xen/efi.c.
>
> Changes since v1:
>
> - Use a different type (struct xen_efi_mem_info) for memory information
>   provided by Xen, as Xen reports it in a different way than the
>   standard Linux functions do.
>
>  drivers/firmware/efi/esrt.c | 71 ++++++++++++++++++++++++++++++-------
>  1 file changed, 58 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
> index 2a2f52b017e736dd995c69e8aeb5fbd7761732e5..378bf2ea770ad3bd747345a89258216919eb21bb 100644
> --- a/drivers/firmware/efi/esrt.c
> +++ b/drivers/firmware/efi/esrt.c
> @@ -28,6 +28,11 @@
>  #include <asm/io.h>
>  #include <asm/early_ioremap.h>
>
> +#ifdef CONFIG_XEN_EFI
> +#include <asm/xen/hypercall.h>
> +#include <xen/page.h>
> +#endif
> +
>  struct efi_system_resource_entry_v1 {
>         efi_guid_t      fw_class;
>         u32             fw_type;
> @@ -243,27 +248,67 @@ void __init efi_esrt_init(void)
>         void *va;
>         struct efi_system_resource_table tmpesrt;
>         size_t size, max, entry_size, entries_size;
> -       efi_memory_desc_t md;
> -       int rc;
>         phys_addr_t end;
> -
> -       if (!efi_enabled(EFI_MEMMAP))
> -               return;
> +       uint32_t type;
>
>         pr_debug("esrt-init: loading.\n");
>         if (!esrt_table_exists())
>                 return;
>
> -       rc = efi_mem_desc_lookup(efi.esrt, &md);
> -       if (rc < 0 ||
> -           (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> -            md.type != EFI_BOOT_SERVICES_DATA &&
> -            md.type != EFI_RUNTIME_SERVICES_DATA)) {
> -               pr_warn("ESRT header is not in the memory map.\n");
> +       if (efi_enabled(EFI_MEMMAP)) {
> +               efi_memory_desc_t md;
> +
> +               if (efi_mem_desc_lookup(efi.esrt, &md) < 0 ||
> +                   (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> +                    md.type != EFI_BOOT_SERVICES_DATA &&
> +                    md.type != EFI_RUNTIME_SERVICES_DATA)) {
> +                       pr_warn("ESRT header is not in the memory map.\n");
> +                       return;
> +               }
> +
> +               type = md.type;
> +               max = efi_mem_desc_end(&md);
> +#ifdef CONFIG_XEN_EFI
> +       } else if (efi_enabled(EFI_PARAVIRT)) {
> +               static_assert(XEN_PAGE_SHIFT == EFI_PAGE_SHIFT,
> +                             "Mismatch between EFI_PAGE_SHIFT and XEN_PAGE_SHIFT");
> +
> +               struct xen_platform_op op = {
> +                       .cmd = XENPF_firmware_info,
> +                       .u.firmware_info = {
> +                               .type = XEN_FW_EFI_INFO,
> +                               .index = XEN_FW_EFI_MEM_INFO,
> +                               .u.efi_info.mem.addr = efi.esrt,
> +                               .u.efi_info.mem.size = ((u64)-1ULL) - efi.esrt,
> +                       }
> +               };
> +               union xenpf_efi_info *info = &op.u.firmware_info.u.efi_info;
> +               int rc = HYPERVISOR_platform_op(&op);
> +
> +               if (rc) {
> +                       pr_warn("Failed to lookup ESRT header %lu in Xen memory map: error %d\n",
> +                               efi.esrt, rc);
> +                       return;
> +               }
> +               type = info->mem.type;
> +               max = info->mem.addr + info->mem.size;
> +
> +               /*
> +                * Recent Xen versions relocate the ESRT to memory of type
> +                * EfiRuntimeServicesData, which Xen will not reuse.  If the ESRT
> +                * is not in EfiRuntimeServicesData memory, it has not been reserved
> +                * by Xen and might be allocated to other guests, so it cannot
> +                * safely be used.
> +                */
> +               if (type != EFI_RUNTIME_SERVICES_DATA) {
> +                       pr_warn("Xen did not reserve ESRT, ignoring it\n");
> +                       return;
> +               }
> +#endif

I am really not happy with this. You are adding a special case
specific to Xen to double check that it has violated the EFI spec as
required. Even if some firmwares exist that do the same, codifying it
like this on mainline Linux code is not something I am comfortable
accepting.

I take it that this also means that ESRT on dom0 is currently just
broken, right?


> +       } else {
>                 return;
>         }
>
> -       max = efi_mem_desc_end(&md);
>         if (max < efi.esrt) {
>                 pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n",
>                        (void *)efi.esrt, (void *)max);
> @@ -333,7 +378,7 @@ void __init efi_esrt_init(void)
>
>         end = esrt_data + size;
>         pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end);
> -       if (md.type == EFI_BOOT_SERVICES_DATA)
> +       if (type == EFI_BOOT_SERVICES_DATA)
>                 efi_mem_reserve(esrt_data, esrt_data_size);
>
>         pr_debug("esrt-init: loaded.\n");
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
> Invisible Things Lab
Jan Beulich Sept. 20, 2022, 3:54 p.m. UTC | #2
On 20.09.2022 17:36, Ard Biesheuvel wrote:
> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> <demi@invisiblethingslab.com> wrote:
>>
>> fwupd requires access to the EFI System Resource Table (ESRT) to
>> discover which firmware can be updated by the OS.  Currently, Linux does
>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
>> Qubes OS.
>>
>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
>> The UEFI specification requires the ESRT to be in EfiBootServicesData
>> memory, which Xen will use for whatever purposes it likes.  Therefore,
>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
>>
>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
>> to access the ESRT under Xen if, and only if, it is in memory of type
>> EfiRuntimeServicesData.
>>
> 
> Thanks for the elaborate explanation. This is really helpful.
> 
> So here, you are explaining that the only way for Xen to prevent
> itself from potentially clobbering the ESRT is by creating a
> completely new allocation?

There are surely other ways, e.g. preserving BootServices* regions
alongside RuntimeServices* ones. But as the maintainer of the EFI
code in Xen I don't view this as a reasonable approach.

> What about other assets that may be passed
> via EFI boot services data regions?

These would need handling similarly then.

> So first of all, EfiRuntimeServicesData has a special purpose: it is
> used to carry data that is part of the EFI runtime service
> implementations themselves. Therefore, it has to be mapped into the
> EFI page tables by the OS kernel, and carved out of the linear map (to
> prevent inadvertent access with mismatched attributes). So unless you
> are writing the code that backs GetVariable() or SetVariable(), there
> are never good reasons to use EfiRuntimeServicesData.

That's a rather strict interpretation of the spec. Even back when
I started dealing with EFI, when it was still quite new, I know
RuntimeServices* was already used for similar purposes.

> If you want to use a memory type that is suitable for firmware tables
> that are intended for consumption by the OS only (and not by the
> runtime services themselves), you might consider EfiAcpiReclaimMemory.

Personally I consider this type less appropriate than the one we
currently use. It's intended to be used by ACPI, which doesn't
come into play here. It comes quite close to using e.g.
EfiUnusableMemory here ... We might be able to (ab)use
EfiLoaderData for this, but that would again require special
casing (inside Xen) when deciding whether the memory can be used
as general-purpose memory.

> TBH I still don't think this is a scalable approach. There are other
> configuration tables that may be passed in EFI boot services memory,
> and MS especially were pushing back in the UEFI forum on adding table
> types that were passed in anything other the EfiBootServicesData.

Within Xen we might abstract the approach currently implemented in
case more such pieces of data appear.

While I can easily believe MS might be advocating for this model,
I view it as problematic not only for Xen. How would you pass on
this information across kexec, for example, without introducing
further producer-consumer dependencies requiring separate protocols
to be followed?

Jan
Ard Biesheuvel Sept. 20, 2022, 4:09 p.m. UTC | #3
On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
>
> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > <demi@invisiblethingslab.com> wrote:
> >>
> >> fwupd requires access to the EFI System Resource Table (ESRT) to
> >> discover which firmware can be updated by the OS.  Currently, Linux does
> >> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> >> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> >> Qubes OS.
> >>
> >> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> >> The UEFI specification requires the ESRT to be in EfiBootServicesData
> >> memory, which Xen will use for whatever purposes it likes.  Therefore,
> >> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> >>
> >> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> >> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> >> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> >> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> >> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> >> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> >> to access the ESRT under Xen if, and only if, it is in memory of type
> >> EfiRuntimeServicesData.
> >>
> >
> > Thanks for the elaborate explanation. This is really helpful.
> >
> > So here, you are explaining that the only way for Xen to prevent
> > itself from potentially clobbering the ESRT is by creating a
> > completely new allocation?
>
> There are surely other ways, e.g. preserving BootServices* regions
> alongside RuntimeServices* ones. But as the maintainer of the EFI
> code in Xen I don't view this as a reasonable approach.
>

Why not?

> > What about other assets that may be passed
> > via EFI boot services data regions?
>
> These would need handling similarly then.
>
> > So first of all, EfiRuntimeServicesData has a special purpose: it is
> > used to carry data that is part of the EFI runtime service
> > implementations themselves. Therefore, it has to be mapped into the
> > EFI page tables by the OS kernel, and carved out of the linear map (to
> > prevent inadvertent access with mismatched attributes). So unless you
> > are writing the code that backs GetVariable() or SetVariable(), there
> > are never good reasons to use EfiRuntimeServicesData.
>
> That's a rather strict interpretation of the spec. Even back when
> I started dealing with EFI, when it was still quite new, I know
> RuntimeServices* was already used for similar purposes.
>

I'm not saying it is never used inappropriately. But using a memory
type that gets mapped into the runtime services page tables every time
a runtime service call is made is pointless, fragments both the EFI
page tables as well as the kernel direct map for no reason.

> > If you want to use a memory type that is suitable for firmware tables
> > that are intended for consumption by the OS only (and not by the
> > runtime services themselves), you might consider EfiAcpiReclaimMemory.
>
> Personally I consider this type less appropriate than the one we
> currently use. It's intended to be used by ACPI, which doesn't
> come into play here.

In spite of the name, that is not really true. It is reclaimable
memory, which means that the OS can use the memory as conventional
memory after consuming its contents, or discarding them.

> It comes quite close to using e.g.
> EfiUnusableMemory here ...

No, that is something completely different. Using unusable memory for
anything would be silly.

> We might be able to (ab)use
> EfiLoaderData for this, but that would again require special
> casing (inside Xen) when deciding whether the memory can be used
> as general-purpose memory.
>

EFI loader data and EFI boot services data are the exact same thing
from this POV. They have no significance to the runtime firmware, and
can be used as ordinary available memory after ExitBootServices().

> > TBH I still don't think this is a scalable approach. There are other
> > configuration tables that may be passed in EFI boot services memory,
> > and MS especially were pushing back in the UEFI forum on adding table
> > types that were passed in anything other the EfiBootServicesData.
>
> Within Xen we might abstract the approach currently implemented in
> case more such pieces of data appear.
>
> While I can easily believe MS might be advocating for this model,
> I view it as problematic not only for Xen. How would you pass on
> this information across kexec, for example, without introducing
> further producer-consumer dependencies requiring separate protocols
> to be followed?
>

In this case, I don't think this is unreasonable for configuration
tables, which only have a GUID and a base address. If the OS knows the
GUID, and knows how to interpret the contents, it can decide for
itself whether or not to preserve it. If it doesn't know the GUID, the
memory is just treated as available memory [after EBS()]

I personally think reclaimable memory is more suitable for these
cases, which is why I am willing to consider that as well. Note that
the EFI spec now also mandates device trees on ARM to be passed via
EfiAcpiReclaimMemory, simply because it is the memory type suitable
for firmware tables that only the OS consumes.
Jan Beulich Sept. 21, 2022, 8:34 p.m. UTC | #4
On 20.09.2022 18:09, Ard Biesheuvel wrote:
> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
>>> <demi@invisiblethingslab.com> wrote:
>>>>
>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
>>>> discover which firmware can be updated by the OS.  Currently, Linux does
>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
>>>> Qubes OS.
>>>>
>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
>>>>
>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
>>>> to access the ESRT under Xen if, and only if, it is in memory of type
>>>> EfiRuntimeServicesData.
>>>>
>>>
>>> Thanks for the elaborate explanation. This is really helpful.
>>>
>>> So here, you are explaining that the only way for Xen to prevent
>>> itself from potentially clobbering the ESRT is by creating a
>>> completely new allocation?
>>
>> There are surely other ways, e.g. preserving BootServices* regions
>> alongside RuntimeServices* ones. But as the maintainer of the EFI
>> code in Xen I don't view this as a reasonable approach.
> 
> Why not?

Because it's against the intentions the EFI has (or at least had)
for this memory type. Much more than EfiAcpiReclaimMemory this
type is intended for use as ordinary RAM post-boot.

>>> TBH I still don't think this is a scalable approach. There are other
>>> configuration tables that may be passed in EFI boot services memory,
>>> and MS especially were pushing back in the UEFI forum on adding table
>>> types that were passed in anything other the EfiBootServicesData.
>>
>> Within Xen we might abstract the approach currently implemented in
>> case more such pieces of data appear.
>>
>> While I can easily believe MS might be advocating for this model,
>> I view it as problematic not only for Xen. How would you pass on
>> this information across kexec, for example, without introducing
>> further producer-consumer dependencies requiring separate protocols
>> to be followed?
>>
> 
> In this case, I don't think this is unreasonable for configuration
> tables, which only have a GUID and a base address. If the OS knows the
> GUID, and knows how to interpret the contents, it can decide for
> itself whether or not to preserve it. If it doesn't know the GUID, the
> memory is just treated as available memory [after EBS()]
> 
> I personally think reclaimable memory is more suitable for these
> cases, which is why I am willing to consider that as well. Note that
> the EFI spec now also mandates device trees on ARM to be passed via
> EfiAcpiReclaimMemory, simply because it is the memory type suitable
> for firmware tables that only the OS consumes.

We do preserve EfiAcpiReclaimMemory, for the simple reason that with
Xen "the OS" is ambiguous: Is that Xen or Dom0? Most of ACPI is
handled by Dom0, so we can't very well discard the data before Dom0
starts. (This then also matters for what you've said in the earlier
paragraph. In particular the sets of known GUIDs may be dissimilar
for Xen and the Dom0 kernel. Considering your other remark about
fragmentation you might agree that preserving in-place is not very
desirable.)

Especially with DT mandated to use EfiAcpiReclaimMemory I'm willing
to consider using that type for the storing of ESRT (and whatever
else may appear along these lines). Demi, you may want to check for
both types in your Linux side patch ...

Jan
Demi Marie Obenour Sept. 22, 2022, 1:09 a.m. UTC | #5
On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> >>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> >>> <demi@invisiblethingslab.com> wrote:
> >>>>
> >>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> >>>> discover which firmware can be updated by the OS.  Currently, Linux does
> >>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> >>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> >>>> Qubes OS.
> >>>>
> >>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> >>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> >>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> >>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> >>>>
> >>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> >>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> >>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> >>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> >>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> >>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> >>>> to access the ESRT under Xen if, and only if, it is in memory of type
> >>>> EfiRuntimeServicesData.
> >>>>
> >>>
> >>> Thanks for the elaborate explanation. This is really helpful.
> >>>
> >>> So here, you are explaining that the only way for Xen to prevent
> >>> itself from potentially clobbering the ESRT is by creating a
> >>> completely new allocation?
> >>
> >> There are surely other ways, e.g. preserving BootServices* regions
> >> alongside RuntimeServices* ones. But as the maintainer of the EFI
> >> code in Xen I don't view this as a reasonable approach.
> > 
> > Why not?
> 
> Because it's against the intentions the EFI has (or at least had)
> for this memory type. Much more than EfiAcpiReclaimMemory this
> type is intended for use as ordinary RAM post-boot.

What about giving that memory to dom0?  dom0’s balloon driver will give
anything dom0 doesn’t wind up using back to Xen.

> >>> TBH I still don't think this is a scalable approach. There are other
> >>> configuration tables that may be passed in EFI boot services memory,
> >>> and MS especially were pushing back in the UEFI forum on adding table
> >>> types that were passed in anything other the EfiBootServicesData.
> >>
> >> Within Xen we might abstract the approach currently implemented in
> >> case more such pieces of data appear.
> >>
> >> While I can easily believe MS might be advocating for this model,
> >> I view it as problematic not only for Xen. How would you pass on
> >> this information across kexec, for example, without introducing
> >> further producer-consumer dependencies requiring separate protocols
> >> to be followed?
> >>
> > 
> > In this case, I don't think this is unreasonable for configuration
> > tables, which only have a GUID and a base address. If the OS knows the
> > GUID, and knows how to interpret the contents, it can decide for
> > itself whether or not to preserve it. If it doesn't know the GUID, the
> > memory is just treated as available memory [after EBS()]
> > 
> > I personally think reclaimable memory is more suitable for these
> > cases, which is why I am willing to consider that as well. Note that
> > the EFI spec now also mandates device trees on ARM to be passed via
> > EfiAcpiReclaimMemory, simply because it is the memory type suitable
> > for firmware tables that only the OS consumes.
> 
> We do preserve EfiAcpiReclaimMemory, for the simple reason that with
> Xen "the OS" is ambiguous: Is that Xen or Dom0? Most of ACPI is
> handled by Dom0, so we can't very well discard the data before Dom0
> starts. (This then also matters for what you've said in the earlier
> paragraph. In particular the sets of known GUIDs may be dissimilar
> for Xen and the Dom0 kernel. Considering your other remark about
> fragmentation you might agree that preserving in-place is not very
> desirable.)
> 
> Especially with DT mandated to use EfiAcpiReclaimMemory I'm willing
> to consider using that type for the storing of ESRT (and whatever
> else may appear along these lines). Demi, you may want to check for
> both types in your Linux side patch ...

EfiAcpiReclaimMemory does seem like a better choice.
Demi Marie Obenour Sept. 22, 2022, 1:53 a.m. UTC | #6
On Tue, Sep 20, 2022 at 06:09:49PM +0200, Ard Biesheuvel wrote:
> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> >
> > On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > > On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > > <demi@invisiblethingslab.com> wrote:
> > >>
> > >> fwupd requires access to the EFI System Resource Table (ESRT) to
> > >> discover which firmware can be updated by the OS.  Currently, Linux does
> > >> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > >> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > >> Qubes OS.
> > >>
> > >> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > >> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > >> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > >> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > >>
> > >> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > >> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > >> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > >> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > >> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > >> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > >> to access the ESRT under Xen if, and only if, it is in memory of type
> > >> EfiRuntimeServicesData.
> > >>
> > > TBH I still don't think this is a scalable approach. There are other
> > > configuration tables that may be passed in EFI boot services memory,
> > > and MS especially were pushing back in the UEFI forum on adding table
> > > types that were passed in anything other the EfiBootServicesData.
> >
> > Within Xen we might abstract the approach currently implemented in
> > case more such pieces of data appear.
> >
> > While I can easily believe MS might be advocating for this model,
> > I view it as problematic not only for Xen. How would you pass on
> > this information across kexec, for example, without introducing
> > further producer-consumer dependencies requiring separate protocols
> > to be followed?
> >
> 
> In this case, I don't think this is unreasonable for configuration
> tables, which only have a GUID and a base address. If the OS knows the
> GUID, and knows how to interpret the contents, it can decide for
> itself whether or not to preserve it. If it doesn't know the GUID, the
> memory is just treated as available memory [after EBS()]

Should an OS uninstall any configuration tables that it does not
preserve if it ever plans to kexec()?  Does Linux do this?
Jan Beulich Sept. 22, 2022, 6:12 a.m. UTC | #7
On 22.09.2022 03:09, Demi Marie Obenour wrote:
> On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
>> On 20.09.2022 18:09, Ard Biesheuvel wrote:
>>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
>>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
>>>>> <demi@invisiblethingslab.com> wrote:
>>>>>>
>>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
>>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
>>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
>>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
>>>>>> Qubes OS.
>>>>>>
>>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
>>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
>>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
>>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
>>>>>>
>>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
>>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
>>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
>>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
>>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
>>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
>>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
>>>>>> EfiRuntimeServicesData.
>>>>>>
>>>>>
>>>>> Thanks for the elaborate explanation. This is really helpful.
>>>>>
>>>>> So here, you are explaining that the only way for Xen to prevent
>>>>> itself from potentially clobbering the ESRT is by creating a
>>>>> completely new allocation?
>>>>
>>>> There are surely other ways, e.g. preserving BootServices* regions
>>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
>>>> code in Xen I don't view this as a reasonable approach.
>>>
>>> Why not?
>>
>> Because it's against the intentions the EFI has (or at least had)
>> for this memory type. Much more than EfiAcpiReclaimMemory this
>> type is intended for use as ordinary RAM post-boot.
> 
> What about giving that memory to dom0?  dom0’s balloon driver will give
> anything dom0 doesn’t wind up using back to Xen.

While perhaps in principle possible, this would require special casing
in Xen. Except for the memory the initrd comes in, we don't directly
hand memory to Dom0. Instead everything goes through the page allocator
first. Plus if we really were convinced boot services memory needed
retaining, then it would also need retaining across kexec (and hence
shouldn't be left to Dom0 to decide what to do with it).

Jan
Demi Marie Obenour Sept. 22, 2022, 2:55 p.m. UTC | #8
On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> >>>>> <demi@invisiblethingslab.com> wrote:
> >>>>>>
> >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> >>>>>> Qubes OS.
> >>>>>>
> >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> >>>>>>
> >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> >>>>>> EfiRuntimeServicesData.
> >>>>>>
> >>>>>
> >>>>> Thanks for the elaborate explanation. This is really helpful.
> >>>>>
> >>>>> So here, you are explaining that the only way for Xen to prevent
> >>>>> itself from potentially clobbering the ESRT is by creating a
> >>>>> completely new allocation?
> >>>>
> >>>> There are surely other ways, e.g. preserving BootServices* regions
> >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> >>>> code in Xen I don't view this as a reasonable approach.
> >>>
> >>> Why not?
> >>
> >> Because it's against the intentions the EFI has (or at least had)
> >> for this memory type. Much more than EfiAcpiReclaimMemory this
> >> type is intended for use as ordinary RAM post-boot.
> > 
> > What about giving that memory to dom0?  dom0’s balloon driver will give
> > anything dom0 doesn’t wind up using back to Xen.
> 
> While perhaps in principle possible, this would require special casing
> in Xen. Except for the memory the initrd comes in, we don't directly
> hand memory to Dom0. Instead everything goes through the page allocator
> first. Plus if we really were convinced boot services memory needed
> retaining, then it would also need retaining across kexec (and hence
> shouldn't be left to Dom0 to decide what to do with it).

So how should dom0 handle the various EFI tables other than the ESRT?
Right now most uses of these tables in Linux are not guarded by any
checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
EfiBootServicesData memory, they might be corrupted before Linux gets
them.
Ard Biesheuvel Sept. 22, 2022, 3:05 p.m. UTC | #9
On Thu, 22 Sept 2022 at 16:56, Demi Marie Obenour
<demi@invisiblethingslab.com> wrote:
>
> On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> > On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> > >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> > >>>>
> > >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > >>>>> <demi@invisiblethingslab.com> wrote:
> > >>>>>>
> > >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> > >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> > >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > >>>>>> Qubes OS.
> > >>>>>>
> > >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > >>>>>>
> > >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> > >>>>>> EfiRuntimeServicesData.
> > >>>>>>
> > >>>>>
> > >>>>> Thanks for the elaborate explanation. This is really helpful.
> > >>>>>
> > >>>>> So here, you are explaining that the only way for Xen to prevent
> > >>>>> itself from potentially clobbering the ESRT is by creating a
> > >>>>> completely new allocation?
> > >>>>
> > >>>> There are surely other ways, e.g. preserving BootServices* regions
> > >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> > >>>> code in Xen I don't view this as a reasonable approach.
> > >>>
> > >>> Why not?
> > >>
> > >> Because it's against the intentions the EFI has (or at least had)
> > >> for this memory type. Much more than EfiAcpiReclaimMemory this
> > >> type is intended for use as ordinary RAM post-boot.
> > >
> > > What about giving that memory to dom0?  dom0’s balloon driver will give
> > > anything dom0 doesn’t wind up using back to Xen.
> >
> > While perhaps in principle possible, this would require special casing
> > in Xen. Except for the memory the initrd comes in, we don't directly
> > hand memory to Dom0. Instead everything goes through the page allocator
> > first. Plus if we really were convinced boot services memory needed
> > retaining, then it would also need retaining across kexec (and hence
> > shouldn't be left to Dom0 to decide what to do with it).
>
> So how should dom0 handle the various EFI tables other than the ESRT?
> Right now most uses of these tables in Linux are not guarded by any
> checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
> EfiBootServicesData memory, they might be corrupted before Linux gets
> them.

Yes, this is an annoying oversight of the EFI design: the config
tables are <guid, address> tuples, and the size of the table is
specific to each table type. So without knowing the GUID, there is no
way you can reserve the right size.

Perhaps you could implement something like a hypercall in
efi_arch_mem_reserve(), which is called by the EFI code to reserve
regions that are in boot services memory but need to remain reserved?
This is used for all config tables that it knows or cares about.
Demi Marie Obenour Sept. 22, 2022, 6:11 p.m. UTC | #10
On Thu, Sep 22, 2022 at 05:05:43PM +0200, Ard Biesheuvel wrote:
> On Thu, 22 Sept 2022 at 16:56, Demi Marie Obenour
> <demi@invisiblethingslab.com> wrote:
> >
> > On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> > > On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > > > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> > > >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > > >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> > > >>>>
> > > >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > > >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > > >>>>> <demi@invisiblethingslab.com> wrote:
> > > >>>>>>
> > > >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> > > >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> > > >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > > >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > > >>>>>> Qubes OS.
> > > >>>>>>
> > > >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > > >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > > >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > > >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > > >>>>>>
> > > >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > > >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > > >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > > >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > > >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > > >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > > >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> > > >>>>>> EfiRuntimeServicesData.
> > > >>>>>>
> > > >>>>>
> > > >>>>> Thanks for the elaborate explanation. This is really helpful.
> > > >>>>>
> > > >>>>> So here, you are explaining that the only way for Xen to prevent
> > > >>>>> itself from potentially clobbering the ESRT is by creating a
> > > >>>>> completely new allocation?
> > > >>>>
> > > >>>> There are surely other ways, e.g. preserving BootServices* regions
> > > >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> > > >>>> code in Xen I don't view this as a reasonable approach.
> > > >>>
> > > >>> Why not?
> > > >>
> > > >> Because it's against the intentions the EFI has (or at least had)
> > > >> for this memory type. Much more than EfiAcpiReclaimMemory this
> > > >> type is intended for use as ordinary RAM post-boot.
> > > >
> > > > What about giving that memory to dom0?  dom0’s balloon driver will give
> > > > anything dom0 doesn’t wind up using back to Xen.
> > >
> > > While perhaps in principle possible, this would require special casing
> > > in Xen. Except for the memory the initrd comes in, we don't directly
> > > hand memory to Dom0. Instead everything goes through the page allocator
> > > first. Plus if we really were convinced boot services memory needed
> > > retaining, then it would also need retaining across kexec (and hence
> > > shouldn't be left to Dom0 to decide what to do with it).
> >
> > So how should dom0 handle the various EFI tables other than the ESRT?
> > Right now most uses of these tables in Linux are not guarded by any
> > checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
> > EfiBootServicesData memory, they might be corrupted before Linux gets
> > them.
> 
> Yes, this is an annoying oversight of the EFI design: the config
> tables are <guid, address> tuples, and the size of the table is
> specific to each table type. So without knowing the GUID, there is no
> way you can reserve the right size.
> 
> Perhaps you could implement something like a hypercall in
> efi_arch_mem_reserve(), which is called by the EFI code to reserve
> regions that are in boot services memory but need to remain reserved?
> This is used for all config tables that it knows or cares about.

On versions of Xen that support spawning multiple domains at boot
(instead of just dom0) this will be racy.  What about refusing to use
tables in EfiBootServicesData when running under Xen unless a hypercall
indicates that Xen has reserved all EfiBootServicesData memory?  Where
could such a check be placed?
Ard Biesheuvel Sept. 22, 2022, 10:14 p.m. UTC | #11
On Thu, 22 Sept 2022 at 20:12, Demi Marie Obenour
<demi@invisiblethingslab.com> wrote:
>
> On Thu, Sep 22, 2022 at 05:05:43PM +0200, Ard Biesheuvel wrote:
> > On Thu, 22 Sept 2022 at 16:56, Demi Marie Obenour
> > <demi@invisiblethingslab.com> wrote:
> > >
> > > On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> > > > On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > > > > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> > > > >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > > > >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> > > > >>>>
> > > > >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > > > >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > > > >>>>> <demi@invisiblethingslab.com> wrote:
> > > > >>>>>>
> > > > >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> > > > >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> > > > >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > > > >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > > > >>>>>> Qubes OS.
> > > > >>>>>>
> > > > >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > > > >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > > > >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > > > >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > > > >>>>>>
> > > > >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > > > >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > > > >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > > > >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > > > >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > > > >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > > > >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> > > > >>>>>> EfiRuntimeServicesData.
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>> Thanks for the elaborate explanation. This is really helpful.
> > > > >>>>>
> > > > >>>>> So here, you are explaining that the only way for Xen to prevent
> > > > >>>>> itself from potentially clobbering the ESRT is by creating a
> > > > >>>>> completely new allocation?
> > > > >>>>
> > > > >>>> There are surely other ways, e.g. preserving BootServices* regions
> > > > >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> > > > >>>> code in Xen I don't view this as a reasonable approach.
> > > > >>>
> > > > >>> Why not?
> > > > >>
> > > > >> Because it's against the intentions the EFI has (or at least had)
> > > > >> for this memory type. Much more than EfiAcpiReclaimMemory this
> > > > >> type is intended for use as ordinary RAM post-boot.
> > > > >
> > > > > What about giving that memory to dom0?  dom0’s balloon driver will give
> > > > > anything dom0 doesn’t wind up using back to Xen.
> > > >
> > > > While perhaps in principle possible, this would require special casing
> > > > in Xen. Except for the memory the initrd comes in, we don't directly
> > > > hand memory to Dom0. Instead everything goes through the page allocator
> > > > first. Plus if we really were convinced boot services memory needed
> > > > retaining, then it would also need retaining across kexec (and hence
> > > > shouldn't be left to Dom0 to decide what to do with it).
> > >
> > > So how should dom0 handle the various EFI tables other than the ESRT?
> > > Right now most uses of these tables in Linux are not guarded by any
> > > checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
> > > EfiBootServicesData memory, they might be corrupted before Linux gets
> > > them.
> >
> > Yes, this is an annoying oversight of the EFI design: the config
> > tables are <guid, address> tuples, and the size of the table is
> > specific to each table type. So without knowing the GUID, there is no
> > way you can reserve the right size.
> >
> > Perhaps you could implement something like a hypercall in
> > efi_arch_mem_reserve(), which is called by the EFI code to reserve
> > regions that are in boot services memory but need to remain reserved?
> > This is used for all config tables that it knows or cares about.
>
> On versions of Xen that support spawning multiple domains at boot
> (instead of just dom0) this will be racy.  What about refusing to use
> tables in EfiBootServicesData when running under Xen unless a hypercall
> indicates that Xen has reserved all EfiBootServicesData memory?  Where
> could such a check be placed?

You could stick a check inside the for loop in
efi_config_parse_tables() to cross reference every table address
against the memory map when running on Xen, and disregard it if it
doesn't meet your criteria.

I take it the issue is not limited to x86?
Demi Marie Obenour Sept. 22, 2022, 11:25 p.m. UTC | #12
On Fri, Sep 23, 2022 at 12:14:50AM +0200, Ard Biesheuvel wrote:
> On Thu, 22 Sept 2022 at 20:12, Demi Marie Obenour
> <demi@invisiblethingslab.com> wrote:
> >
> > On Thu, Sep 22, 2022 at 05:05:43PM +0200, Ard Biesheuvel wrote:
> > > On Thu, 22 Sept 2022 at 16:56, Demi Marie Obenour
> > > <demi@invisiblethingslab.com> wrote:
> > > >
> > > > On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> > > > > On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > > > > > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> > > > > >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > > > > >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> > > > > >>>>
> > > > > >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > > > > >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > > > > >>>>> <demi@invisiblethingslab.com> wrote:
> > > > > >>>>>>
> > > > > >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> > > > > >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> > > > > >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > > > > >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > > > > >>>>>> Qubes OS.
> > > > > >>>>>>
> > > > > >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > > > > >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > > > > >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > > > > >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > > > > >>>>>>
> > > > > >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > > > > >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > > > > >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > > > > >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > > > > >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > > > > >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > > > > >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> > > > > >>>>>> EfiRuntimeServicesData.
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>> Thanks for the elaborate explanation. This is really helpful.
> > > > > >>>>>
> > > > > >>>>> So here, you are explaining that the only way for Xen to prevent
> > > > > >>>>> itself from potentially clobbering the ESRT is by creating a
> > > > > >>>>> completely new allocation?
> > > > > >>>>
> > > > > >>>> There are surely other ways, e.g. preserving BootServices* regions
> > > > > >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> > > > > >>>> code in Xen I don't view this as a reasonable approach.
> > > > > >>>
> > > > > >>> Why not?
> > > > > >>
> > > > > >> Because it's against the intentions the EFI has (or at least had)
> > > > > >> for this memory type. Much more than EfiAcpiReclaimMemory this
> > > > > >> type is intended for use as ordinary RAM post-boot.
> > > > > >
> > > > > > What about giving that memory to dom0?  dom0’s balloon driver will give
> > > > > > anything dom0 doesn’t wind up using back to Xen.
> > > > >
> > > > > While perhaps in principle possible, this would require special casing
> > > > > in Xen. Except for the memory the initrd comes in, we don't directly
> > > > > hand memory to Dom0. Instead everything goes through the page allocator
> > > > > first. Plus if we really were convinced boot services memory needed
> > > > > retaining, then it would also need retaining across kexec (and hence
> > > > > shouldn't be left to Dom0 to decide what to do with it).
> > > >
> > > > So how should dom0 handle the various EFI tables other than the ESRT?
> > > > Right now most uses of these tables in Linux are not guarded by any
> > > > checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
> > > > EfiBootServicesData memory, they might be corrupted before Linux gets
> > > > them.
> > >
> > > Yes, this is an annoying oversight of the EFI design: the config
> > > tables are <guid, address> tuples, and the size of the table is
> > > specific to each table type. So without knowing the GUID, there is no
> > > way you can reserve the right size.
> > >
> > > Perhaps you could implement something like a hypercall in
> > > efi_arch_mem_reserve(), which is called by the EFI code to reserve
> > > regions that are in boot services memory but need to remain reserved?
> > > This is used for all config tables that it knows or cares about.
> >
> > On versions of Xen that support spawning multiple domains at boot
> > (instead of just dom0) this will be racy.  What about refusing to use
> > tables in EfiBootServicesData when running under Xen unless a hypercall
> > indicates that Xen has reserved all EfiBootServicesData memory?  Where
> > could such a check be placed?
> 
> You could stick a check inside the for loop in
> efi_config_parse_tables() to cross reference every table address
> against the memory map when running on Xen, and disregard it if it
> doesn't meet your criteria.
> 
> I take it the issue is not limited to x86?

Indeed the issue is cross-platform.  For Qubes OS, I wonder if a safer
approach would be to reserve all EfiBootServicesData memory by default.
Ard Biesheuvel Sept. 23, 2022, 6:45 a.m. UTC | #13
On Fri, 23 Sept 2022 at 01:27, Demi Marie Obenour
<demi@invisiblethingslab.com> wrote:
>
> On Fri, Sep 23, 2022 at 12:14:50AM +0200, Ard Biesheuvel wrote:
> > On Thu, 22 Sept 2022 at 20:12, Demi Marie Obenour
> > <demi@invisiblethingslab.com> wrote:
> > >
> > > On Thu, Sep 22, 2022 at 05:05:43PM +0200, Ard Biesheuvel wrote:
> > > > On Thu, 22 Sept 2022 at 16:56, Demi Marie Obenour
> > > > <demi@invisiblethingslab.com> wrote:
> > > > >
> > > > > On Thu, Sep 22, 2022 at 08:12:14AM +0200, Jan Beulich wrote:
> > > > > > On 22.09.2022 03:09, Demi Marie Obenour wrote:
> > > > > > > On Wed, Sep 21, 2022 at 10:34:04PM +0200, Jan Beulich wrote:
> > > > > > >> On 20.09.2022 18:09, Ard Biesheuvel wrote:
> > > > > > >>> On Tue, 20 Sept 2022 at 17:54, Jan Beulich <jbeulich@suse.com> wrote:
> > > > > > >>>>
> > > > > > >>>> On 20.09.2022 17:36, Ard Biesheuvel wrote:
> > > > > > >>>>> On Mon, 19 Sept 2022 at 21:33, Demi Marie Obenour
> > > > > > >>>>> <demi@invisiblethingslab.com> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> fwupd requires access to the EFI System Resource Table (ESRT) to
> > > > > > >>>>>> discover which firmware can be updated by the OS.  Currently, Linux does
> > > > > > >>>>>> not expose the ESRT when running as a Xen dom0.  Therefore, it is not
> > > > > > >>>>>> possible to use fwupd in a Xen dom0, which is a serious problem for e.g.
> > > > > > >>>>>> Qubes OS.
> > > > > > >>>>>>
> > > > > > >>>>>> Before Xen 4.16, this was not fixable due to hypervisor limitations.
> > > > > > >>>>>> The UEFI specification requires the ESRT to be in EfiBootServicesData
> > > > > > >>>>>> memory, which Xen will use for whatever purposes it likes.  Therefore,
> > > > > > >>>>>> Linux cannot safely access the ESRT, as Xen may have overwritten it.
> > > > > > >>>>>>
> > > > > > >>>>>> Starting with Xen 4.17, Xen checks if the ESRT is in EfiBootServicesData
> > > > > > >>>>>> or EfiRuntimeServicesData memory.  If the ESRT is in EfiBootServicesData
> > > > > > >>>>>> memory, Xen allocates some memory of type EfiRuntimeServicesData, copies
> > > > > > >>>>>> the ESRT to it, and finally replaces the ESRT pointer with a pointer to
> > > > > > >>>>>> the copy.  Since Xen will not clobber EfiRuntimeServicesData memory,
> > > > > > >>>>>> this ensures that the ESRT can safely be accessed by the OS.  It is safe
> > > > > > >>>>>> to access the ESRT under Xen if, and only if, it is in memory of type
> > > > > > >>>>>> EfiRuntimeServicesData.
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>> Thanks for the elaborate explanation. This is really helpful.
> > > > > > >>>>>
> > > > > > >>>>> So here, you are explaining that the only way for Xen to prevent
> > > > > > >>>>> itself from potentially clobbering the ESRT is by creating a
> > > > > > >>>>> completely new allocation?
> > > > > > >>>>
> > > > > > >>>> There are surely other ways, e.g. preserving BootServices* regions
> > > > > > >>>> alongside RuntimeServices* ones. But as the maintainer of the EFI
> > > > > > >>>> code in Xen I don't view this as a reasonable approach.
> > > > > > >>>
> > > > > > >>> Why not?
> > > > > > >>
> > > > > > >> Because it's against the intentions the EFI has (or at least had)
> > > > > > >> for this memory type. Much more than EfiAcpiReclaimMemory this
> > > > > > >> type is intended for use as ordinary RAM post-boot.
> > > > > > >
> > > > > > > What about giving that memory to dom0?  dom0’s balloon driver will give
> > > > > > > anything dom0 doesn’t wind up using back to Xen.
> > > > > >
> > > > > > While perhaps in principle possible, this would require special casing
> > > > > > in Xen. Except for the memory the initrd comes in, we don't directly
> > > > > > hand memory to Dom0. Instead everything goes through the page allocator
> > > > > > first. Plus if we really were convinced boot services memory needed
> > > > > > retaining, then it would also need retaining across kexec (and hence
> > > > > > shouldn't be left to Dom0 to decide what to do with it).
> > > > >
> > > > > So how should dom0 handle the various EFI tables other than the ESRT?
> > > > > Right now most uses of these tables in Linux are not guarded by any
> > > > > checks for efi_enabled(EFI_MEMMAP) or similar.  If some of them are in
> > > > > EfiBootServicesData memory, they might be corrupted before Linux gets
> > > > > them.
> > > >
> > > > Yes, this is an annoying oversight of the EFI design: the config
> > > > tables are <guid, address> tuples, and the size of the table is
> > > > specific to each table type. So without knowing the GUID, there is no
> > > > way you can reserve the right size.
> > > >
> > > > Perhaps you could implement something like a hypercall in
> > > > efi_arch_mem_reserve(), which is called by the EFI code to reserve
> > > > regions that are in boot services memory but need to remain reserved?
> > > > This is used for all config tables that it knows or cares about.
> > >
> > > On versions of Xen that support spawning multiple domains at boot
> > > (instead of just dom0) this will be racy.  What about refusing to use
> > > tables in EfiBootServicesData when running under Xen unless a hypercall
> > > indicates that Xen has reserved all EfiBootServicesData memory?  Where
> > > could such a check be placed?
> >
> > You could stick a check inside the for loop in
> > efi_config_parse_tables() to cross reference every table address
> > against the memory map when running on Xen, and disregard it if it
> > doesn't meet your criteria.
> >
> > I take it the issue is not limited to x86?
>
> Indeed the issue is cross-platform.  For Qubes OS, I wonder if a safer
> approach would be to reserve all EfiBootServicesData memory by default.

You only need to reserve the ones that have configuration tables
pointing into them.
diff mbox series

Patch

diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
index 2a2f52b017e736dd995c69e8aeb5fbd7761732e5..378bf2ea770ad3bd747345a89258216919eb21bb 100644
--- a/drivers/firmware/efi/esrt.c
+++ b/drivers/firmware/efi/esrt.c
@@ -28,6 +28,11 @@ 
 #include <asm/io.h>
 #include <asm/early_ioremap.h>
 
+#ifdef CONFIG_XEN_EFI
+#include <asm/xen/hypercall.h>
+#include <xen/page.h>
+#endif
+
 struct efi_system_resource_entry_v1 {
 	efi_guid_t	fw_class;
 	u32		fw_type;
@@ -243,27 +248,67 @@  void __init efi_esrt_init(void)
 	void *va;
 	struct efi_system_resource_table tmpesrt;
 	size_t size, max, entry_size, entries_size;
-	efi_memory_desc_t md;
-	int rc;
 	phys_addr_t end;
-
-	if (!efi_enabled(EFI_MEMMAP))
-		return;
+	uint32_t type;
 
 	pr_debug("esrt-init: loading.\n");
 	if (!esrt_table_exists())
 		return;
 
-	rc = efi_mem_desc_lookup(efi.esrt, &md);
-	if (rc < 0 ||
-	    (!(md.attribute & EFI_MEMORY_RUNTIME) &&
-	     md.type != EFI_BOOT_SERVICES_DATA &&
-	     md.type != EFI_RUNTIME_SERVICES_DATA)) {
-		pr_warn("ESRT header is not in the memory map.\n");
+	if (efi_enabled(EFI_MEMMAP)) {
+		efi_memory_desc_t md;
+
+		if (efi_mem_desc_lookup(efi.esrt, &md) < 0 ||
+		    (!(md.attribute & EFI_MEMORY_RUNTIME) &&
+		     md.type != EFI_BOOT_SERVICES_DATA &&
+		     md.type != EFI_RUNTIME_SERVICES_DATA)) {
+			pr_warn("ESRT header is not in the memory map.\n");
+			return;
+		}
+
+		type = md.type;
+		max = efi_mem_desc_end(&md);
+#ifdef CONFIG_XEN_EFI
+	} else if (efi_enabled(EFI_PARAVIRT)) {
+		static_assert(XEN_PAGE_SHIFT == EFI_PAGE_SHIFT,
+			      "Mismatch between EFI_PAGE_SHIFT and XEN_PAGE_SHIFT");
+
+		struct xen_platform_op op = {
+			.cmd = XENPF_firmware_info,
+			.u.firmware_info = {
+				.type = XEN_FW_EFI_INFO,
+				.index = XEN_FW_EFI_MEM_INFO,
+				.u.efi_info.mem.addr = efi.esrt,
+				.u.efi_info.mem.size = ((u64)-1ULL) - efi.esrt,
+			}
+		};
+		union xenpf_efi_info *info = &op.u.firmware_info.u.efi_info;
+		int rc = HYPERVISOR_platform_op(&op);
+
+		if (rc) {
+			pr_warn("Failed to lookup ESRT header %lu in Xen memory map: error %d\n",
+				efi.esrt, rc);
+			return;
+		}
+		type = info->mem.type;
+		max = info->mem.addr + info->mem.size;
+
+		/*
+		 * Recent Xen versions relocate the ESRT to memory of type
+		 * EfiRuntimeServicesData, which Xen will not reuse.  If the ESRT
+		 * is not in EfiRuntimeServicesData memory, it has not been reserved
+		 * by Xen and might be allocated to other guests, so it cannot
+		 * safely be used.
+		 */
+		if (type != EFI_RUNTIME_SERVICES_DATA) {
+			pr_warn("Xen did not reserve ESRT, ignoring it\n");
+			return;
+		}
+#endif
+	} else {
 		return;
 	}
 
-	max = efi_mem_desc_end(&md);
 	if (max < efi.esrt) {
 		pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n",
 		       (void *)efi.esrt, (void *)max);
@@ -333,7 +378,7 @@  void __init efi_esrt_init(void)
 
 	end = esrt_data + size;
 	pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end);
-	if (md.type == EFI_BOOT_SERVICES_DATA)
+	if (type == EFI_BOOT_SERVICES_DATA)
 		efi_mem_reserve(esrt_data, esrt_data_size);
 
 	pr_debug("esrt-init: loaded.\n");