diff mbox

efi/arm64: fix fdt-related memory reservation

Message ID 1410183102-6969-1-git-send-email-msalter@redhat.com
State New
Headers show

Commit Message

Mark Salter Sept. 8, 2014, 1:31 p.m. UTC
Commit 86c8b27a01cf:
 "arm64: ignore DT memreserve entries when booting in UEFI mode

prevents early_init_fdt_scan_reserved_mem() from being called for
arm64 kernels booting via UEFI. This was done because the kernel
will use the UEFI memory map to determine reserved memory regions.
That approach has problems in that early_init_fdt_scan_reserved_mem()
also reserves the FDT itself and any node-specific reserved memory.
By chance of some kernel configs, the FDT may be overwritten before
it can be unflattened and the kernel will fail to boot. More subtle
problems will result if the FDT has node specific reserved memory
which is not really reserved.

This patch has the UEFI stub remove the memory reserve map entries
from the FDT as it does with the memory nodes. This allows
early_init_fdt_scan_reserved_mem() to be called unconditionally
so that the other needed reservations are made.

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/arm64/mm/init.c               |  3 +--
 drivers/firmware/efi/libstub/fdt.c | 10 +++++++++-
 2 files changed, 10 insertions(+), 3 deletions(-)

Comments

Ard Biesheuvel Sept. 8, 2014, 1:56 p.m. UTC | #1
On 8 September 2014 15:31, Mark Salter <msalter@redhat.com> wrote:
> Commit 86c8b27a01cf:
>  "arm64: ignore DT memreserve entries when booting in UEFI mode
>
> prevents early_init_fdt_scan_reserved_mem() from being called for
> arm64 kernels booting via UEFI. This was done because the kernel
> will use the UEFI memory map to determine reserved memory regions.
> That approach has problems in that early_init_fdt_scan_reserved_mem()
> also reserves the FDT itself and any node-specific reserved memory.
> By chance of some kernel configs, the FDT may be overwritten before
> it can be unflattened and the kernel will fail to boot. More subtle
> problems will result if the FDT has node specific reserved memory
> which is not really reserved.
>
> This patch has the UEFI stub remove the memory reserve map entries
> from the FDT as it does with the memory nodes. This allows
> early_init_fdt_scan_reserved_mem() to be called unconditionally
> so that the other needed reservations are made.
>
> Signed-off-by: Mark Salter <msalter@redhat.com>

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Interesting. I noticed the other day that my FDT got clobbered when
accidentally booting with very little memory on QEMU , but I hadn't
looked into it yet in more detail.

Matt: could you put this in your 'urgent' queue as well, please?

Cheers,
Ard.


> ---
>  arch/arm64/mm/init.c               |  3 +--
>  drivers/firmware/efi/libstub/fdt.c | 10 +++++++++-
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 5472c24..a83061f 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -149,8 +149,7 @@ void __init arm64_memblock_init(void)
>                 memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
>  #endif
>
> -       if (!efi_enabled(EFI_MEMMAP))
> -               early_init_fdt_scan_reserved_mem();
> +       early_init_fdt_scan_reserved_mem();
>
>         /* 4GB maximum for 32-bit only capable devices */
>         if (IS_ENABLED(CONFIG_ZONE_DMA))
> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
> index a56bb35..723c7da 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -22,7 +22,7 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
>                         unsigned long map_size, unsigned long desc_size,
>                         u32 desc_ver)
>  {
> -       int node, prev;
> +       int node, prev, i, num_rsv;
>         int status;
>         u32 fdt_val32;
>         u64 fdt_val64;
> @@ -73,6 +73,14 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
>                 prev = node;
>         }
>
> +       /*
> +        * Delete all memory reserve map entries. When booting via UEFI,
> +        * kernel will use the UEFI memory map to find reserved regions.
> +        */
> +       num_rsv = fdt_num_mem_rsv(fdt);
> +       for (i = 0; i < num_rsv; i++)
> +               fdt_del_mem_rsv(fdt, i);
> +
>         node = fdt_subnode_offset(fdt, 0, "chosen");
>         if (node < 0) {
>                 node = fdt_add_subnode(fdt, 0, "chosen");
> --
> 1.8.3.1
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Mark Rutland Sept. 8, 2014, 2:06 p.m. UTC | #2
Hi Mark,

On Mon, Sep 08, 2014 at 02:31:42PM +0100, Mark Salter wrote:
> Commit 86c8b27a01cf:
>  "arm64: ignore DT memreserve entries when booting in UEFI mode
> 
> prevents early_init_fdt_scan_reserved_mem() from being called for
> arm64 kernels booting via UEFI. This was done because the kernel
> will use the UEFI memory map to determine reserved memory regions.
> That approach has problems in that early_init_fdt_scan_reserved_mem()
> also reserves the FDT itself and any node-specific reserved memory.
> By chance of some kernel configs, the FDT may be overwritten before
> it can be unflattened and the kernel will fail to boot. More subtle
> problems will result if the FDT has node specific reserved memory
> which is not really reserved.

That doesn't sound like fun; apologies for allowing such brokenness
through in the first place.

[...]

> +	/*
> +	 * Delete all memory reserve map entries. When booting via UEFI,
> +	 * kernel will use the UEFI memory map to find reserved regions.
> +	 */
> +	num_rsv = fdt_num_mem_rsv(fdt);
> +	for (i = 0; i < num_rsv; i++)
> +		fdt_del_mem_rsv(fdt, i);

I don't think that's right. Won't the memreserve entries shift down by
one each time we call fdt_del_mem_rsv?

Shouldn't this be something like:

while (fdt_num_mem_rsv(fdt))
	fdt_del_mem_rsv(fdt, 0);

Or we could count downwards.

Otherwise, the general approach sounds sane to me, so with that bug
fixed or disproven:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Given this is mostly in the EFI stub I expect that this will go via the
EFI tree?

Mark.

> +
>  	node = fdt_subnode_offset(fdt, 0, "chosen");
>  	if (node < 0) {
>  		node = fdt_add_subnode(fdt, 0, "chosen");
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-efi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Mark Salter Sept. 8, 2014, 2:21 p.m. UTC | #3
On Mon, 2014-09-08 at 15:06 +0100, Mark Rutland wrote:
> Hi Mark,
> 
> On Mon, Sep 08, 2014 at 02:31:42PM +0100, Mark Salter wrote:
> > Commit 86c8b27a01cf:
> >  "arm64: ignore DT memreserve entries when booting in UEFI mode
> > 
> > prevents early_init_fdt_scan_reserved_mem() from being called for
> > arm64 kernels booting via UEFI. This was done because the kernel
> > will use the UEFI memory map to determine reserved memory regions.
> > That approach has problems in that early_init_fdt_scan_reserved_mem()
> > also reserves the FDT itself and any node-specific reserved memory.
> > By chance of some kernel configs, the FDT may be overwritten before
> > it can be unflattened and the kernel will fail to boot. More subtle
> > problems will result if the FDT has node specific reserved memory
> > which is not really reserved.
> 
> That doesn't sound like fun; apologies for allowing such brokenness
> through in the first place.

Heh. It was obvious that DT unflattening was broken, but bisecting
didn't help much because I kept finding patches which when reverted
made the problem go away even though they obviously weren't the
cause.

> 
> [...]
> 
> > +	/*
> > +	 * Delete all memory reserve map entries. When booting via UEFI,
> > +	 * kernel will use the UEFI memory map to find reserved regions.
> > +	 */
> > +	num_rsv = fdt_num_mem_rsv(fdt);
> > +	for (i = 0; i < num_rsv; i++)
> > +		fdt_del_mem_rsv(fdt, i);
> 
> I don't think that's right. Won't the memreserve entries shift down by
> one each time we call fdt_del_mem_rsv?
> 
> Shouldn't this be something like:
> 
> while (fdt_num_mem_rsv(fdt))
> 	fdt_del_mem_rsv(fdt, 0);
> 
> Or we could count downwards.
> 

Sigh. Yes, you are right. I only tested with one reserved region.
I think counting down would be the way to go. I'll send a fixed
patch shortly.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Mark Rutland Sept. 8, 2014, 2:28 p.m. UTC | #4
On Mon, Sep 08, 2014 at 03:21:05PM +0100, Mark Salter wrote:
> On Mon, 2014-09-08 at 15:06 +0100, Mark Rutland wrote:
> > Hi Mark,
> > 
> > On Mon, Sep 08, 2014 at 02:31:42PM +0100, Mark Salter wrote:
> > > Commit 86c8b27a01cf:
> > >  "arm64: ignore DT memreserve entries when booting in UEFI mode
> > > 
> > > prevents early_init_fdt_scan_reserved_mem() from being called for
> > > arm64 kernels booting via UEFI. This was done because the kernel
> > > will use the UEFI memory map to determine reserved memory regions.
> > > That approach has problems in that early_init_fdt_scan_reserved_mem()
> > > also reserves the FDT itself and any node-specific reserved memory.
> > > By chance of some kernel configs, the FDT may be overwritten before
> > > it can be unflattened and the kernel will fail to boot. More subtle
> > > problems will result if the FDT has node specific reserved memory
> > > which is not really reserved.
> > 
> > That doesn't sound like fun; apologies for allowing such brokenness
> > through in the first place.
> 
> Heh. It was obvious that DT unflattening was broken, but bisecting
> didn't help much because I kept finding patches which when reverted
> made the problem go away even though they obviously weren't the
> cause.

Yeah, those types of bugs are never fun. I recall a similar situation
with the __INIT annotation in the versatile pen code. 

> > [...]
> > 
> > > +	/*
> > > +	 * Delete all memory reserve map entries. When booting via UEFI,
> > > +	 * kernel will use the UEFI memory map to find reserved regions.
> > > +	 */
> > > +	num_rsv = fdt_num_mem_rsv(fdt);
> > > +	for (i = 0; i < num_rsv; i++)
> > > +		fdt_del_mem_rsv(fdt, i);
> > 
> > I don't think that's right. Won't the memreserve entries shift down by
> > one each time we call fdt_del_mem_rsv?
> > 
> > Shouldn't this be something like:
> > 
> > while (fdt_num_mem_rsv(fdt))
> > 	fdt_del_mem_rsv(fdt, 0);
> > 
> > Or we could count downwards.
> > 
> 
> Sigh. Yes, you are right. I only tested with one reserved region.
> I think counting down would be the way to go. I'll send a fixed
> patch shortly.

Ok, that sounds fine by me; make sure you add my ack :)

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
diff mbox

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5472c24..a83061f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -149,8 +149,7 @@  void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
-	if (!efi_enabled(EFI_MEMMAP))
-		early_init_fdt_scan_reserved_mem();
+	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
 	if (IS_ENABLED(CONFIG_ZONE_DMA))
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index a56bb35..723c7da 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -22,7 +22,7 @@  efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 			unsigned long map_size, unsigned long desc_size,
 			u32 desc_ver)
 {
-	int node, prev;
+	int node, prev, i, num_rsv;
 	int status;
 	u32 fdt_val32;
 	u64 fdt_val64;
@@ -73,6 +73,14 @@  efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 		prev = node;
 	}
 
+	/*
+	 * Delete all memory reserve map entries. When booting via UEFI,
+	 * kernel will use the UEFI memory map to find reserved regions.
+	 */
+	num_rsv = fdt_num_mem_rsv(fdt);
+	for (i = 0; i < num_rsv; i++)
+		fdt_del_mem_rsv(fdt, i);
+
 	node = fdt_subnode_offset(fdt, 0, "chosen");
 	if (node < 0) {
 		node = fdt_add_subnode(fdt, 0, "chosen");