diff mbox

[v3,04/11] arm64: use fixmap region for permanent FDT mapping

Message ID 1428674035-26603-5-git-send-email-ard.biesheuvel@linaro.org
State New
Headers show

Commit Message

Ard Biesheuvel April 10, 2015, 1:53 p.m. UTC
Currently, the FDT blob needs to be in the same 512 MB region as
the kernel, so that it can be mapped into the kernel virtual memory
space very early on using a minimal set of statically allocated
translation tables.

Now that we have early fixmap support, we can relax this restriction,
by moving the permanent FDT mapping to the fixmap region instead.
This way, the FDT blob may be anywhere in memory.

This also moves the vetting of the FDT to setup.c, since the early
init code in head.S does not handle mapping of the FDT anymore.
At the same time, fix up some comments in head.S that have gone stale.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt |  8 +++----
 arch/arm64/include/asm/fixmap.h |  9 ++++++++
 arch/arm64/include/asm/mmu.h    |  1 +
 arch/arm64/kernel/Makefile      |  1 +
 arch/arm64/kernel/head.S        | 39 +--------------------------------
 arch/arm64/kernel/setup.c       | 45 ++++++++++++++++++++------------------
 arch/arm64/mm/init.c            |  1 -
 arch/arm64/mm/mmu.c             | 48 +++++++++++++++++++++++++++++++++++++++++
 8 files changed, 88 insertions(+), 64 deletions(-)

Comments

Ard Biesheuvel April 13, 2015, 3:15 p.m. UTC | #1
On 13 April 2015 at 17:02, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Ard,
>
>> -The device tree blob (dtb) must be placed on an 8-byte boundary within
>> -the first 512 megabytes from the start of the kernel image and must not
>> -cross a 2-megabyte boundary. This is to allow the kernel to map the
>> -blob using a single section mapping in the initial page tables.
>> +The device tree blob (dtb) must be placed on an 8-byte boundary and must
>> +not exceed 2 megabytes in size.
>>
>> +NOTE: versions prior to v4.2 also require that the dtb be placed within
>> +512 MB of the 2 MB aligned boundary right below the kernel Image.
>
> This reads a little odd (it sounds like you could load the DTB 512M
> below the boundary too).
>
> How about:
>
> NOTE: versions prior to v4.2 also require that the DTB be placed within
> the 512 MB region starting at text_offset bytes below the kernel Image.
>

OK

> [...]
>
>>  enum fixed_addresses {
>>       FIX_HOLE,
>> +
>> +     /*
>> +      * Reserve 4 MB of virtual space for the FDT at the top of the fixmap
>> +      * region. Keep this at the top so it remains 2 MB aligned.
>> +      */
>> +#define FIX_FDT_SIZE         SZ_4M
>> +     FIX_FDT_END,
>> +     FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
>> +
>>       FIX_EARLYCON_MEM_BASE,
>>       FIX_TEXT_POKE0,
>>       __end_of_permanent_fixed_addresses,
>
> [...]
>
>> +     /*
>> +      * Before passing the dt_virt pointer to early_init_dt_scan(), we have
>> +      * to ensure that the FDT size as reported in the FDT itself does not
>> +      * exceed the window we just mapped for it.
>> +      */
>> +     if (!dt_virt ||
>> +         fdt_check_header(dt_virt) != 0 ||
>> +         (dt_phys & (SZ_2M - 1)) + fdt_totalsize(dt_virt) > FIX_FDT_SIZE ||
>> +         !early_init_dt_scan(dt_virt)) {
>
> Doesn't this allow a DTB > 2M (violating the boot protocol's 2M
> restriction)?
>

Yes it does, and I was wondering what to do about it. We could be
pedantic, and fail here with no user visible feedback about what went
wrong, or we could be lax here, and add a warning later (similar to
the x1/x2/x3 check) that the boot protocol was violated if the FDT
exceeds 2 MB but otherwise proceed normally.

> Perhaps we want a MAX_FDT_SIZE, and have FIX_FDT_SIZE be MAX_FDT_SIZE +
> 2M. Then we can check that the DTB <= MAX_FDT_SIZE, and we know we will
> always be able to map it.
>
> Otherwise we could get intermittent failures for DTBs larger than 2MB,
> depending on where they're loaded. Always failing those for now would
> give us a consistent early warning that we need to bump MAX_FDT_SIZE.
>

Consistent, early, but emitted before we even have a console ...

> [...]
>
>> +void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
>> +{
>> +     const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
>> +     phys_addr_t dt_phys_base = dt_phys & ~(SZ_2M - 1);
>
> Do we definitely retain the high bits here? I don't have the C integer
> promotion rules paged into my head, but the definition in
> include/linux/sizes.h makes me suspicious, given it should have an int
> type:
>
> #define SZ_2M                           0x00200000
>

I will use round_down() instead

>> +     u64 pfn = dt_phys_base >> PAGE_SHIFT;
>> +     u64 pfn_end = pfn + (FIX_FDT_SIZE >> PAGE_SHIFT);
>> +     pgprot_t prot = PAGE_KERNEL | PTE_RDONLY;
>> +
>> +     /*
>> +      * Make sure that the FDT region can be mapped without the need to
>> +      * allocate additional translation table pages, or perform physical
>> +      * to virtual address translations via the linear mapping.
>> +      *
>> +      * On 4k pages, we'll use section mappings for the region so we
>> +      * only have to be in the same PUD as the rest of the fixmap.
>> +      * On 64k pages, we need to be in the same PMD as well, as the region
>> +      * will be mapped using PTEs.
>> +      */
>
> It would be nice to have the last two statements swapped (with
> appropriate wording fixups) to match the order of the if branches.
>

ok

> Otherwise this looks good to me.
>

thanks.
Ard Biesheuvel April 13, 2015, 3:45 p.m. UTC | #2
On 13 April 2015 at 17:26, Mark Rutland <mark.rutland@arm.com> wrote:
>> >> +     /*
>> >> +      * Before passing the dt_virt pointer to early_init_dt_scan(), we have
>> >> +      * to ensure that the FDT size as reported in the FDT itself does not
>> >> +      * exceed the window we just mapped for it.
>> >> +      */
>> >> +     if (!dt_virt ||
>> >> +         fdt_check_header(dt_virt) != 0 ||
>> >> +         (dt_phys & (SZ_2M - 1)) + fdt_totalsize(dt_virt) > FIX_FDT_SIZE ||
>> >> +         !early_init_dt_scan(dt_virt)) {
>> >
>> > Doesn't this allow a DTB > 2M (violating the boot protocol's 2M
>> > restriction)?
>> >
>>
>> Yes it does, and I was wondering what to do about it. We could be
>> pedantic, and fail here with no user visible feedback about what went
>> wrong, or we could be lax here, and add a warning later (similar to
>> the x1/x2/x3 check) that the boot protocol was violated if the FDT
>> exceeds 2 MB but otherwise proceed normally.
>>
>> > Perhaps we want a MAX_FDT_SIZE, and have FIX_FDT_SIZE be MAX_FDT_SIZE +
>> > 2M. Then we can check that the DTB <= MAX_FDT_SIZE, and we know we will
>> > always be able to map it.
>> >
>> > Otherwise we could get intermittent failures for DTBs larger than 2MB,
>> > depending on where they're loaded. Always failing those for now would
>> > give us a consistent early warning that we need to bump MAX_FDT_SIZE.
>> >
>>
>> Consistent, early, but emitted before we even have a console ...
>
> When I say warning, I mean that boot will consistently fail, rather than
> there being a pr_warn.
>
> People seem to wait until their platform is hilariously broken before
> reporting bugs...
>

Yeah, that is true. At this stage, there is still room to be pedantic.

I will take your MAX_FDT_SiZE/FIX_FDT_SIZE suggestion, and check
fdt_totalsize() against the former instead.

I was wondering: this code now always maps the full 4 MB, also on 64k
pages, even if in the majority of cases, the actual FDT is much
smaller. Do you think there is a downside to that?
Ard Biesheuvel April 14, 2015, 7:44 a.m. UTC | #3
On 13 apr. 2015, at 18:26, Mark Rutland <mark.rutland@arm.com> wrote:

>> I was wondering: this code now always maps the full 4 MB, also on 64k
>> pages, even if in the majority of cases, the actual FDT is much
>> smaller. Do you think there is a downside to that?
> 
> The two cases I can think of as problematic are when the FDT is in the
> last 2M of RAM, or in a 2M region immediately before some carveout which
> won't use MT_NORMAL attributes.
> 
> We could solve those by only mapping however many 2M chunks are
> necessary, or we could keep the requirement that it fits within a
> naturally-aligned 2M region for now.
> 

I strongly prefer the former, since otherwise, you always need to verify the allocation explicitly, for an fdt of any size, and it turns out the logic is easily updated to move the vetting into fixmap_remap_fdt() and check the magic and size after mapping the first chunk, and set pfn_end accordingly

> I don't see that this should yield significant problems otherwise.
> 
> Mark.
diff mbox

Patch

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index f3c05b5f9f08..6396460f6085 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -45,11 +45,11 @@  sees fit.)
 
 Requirement: MANDATORY
 
-The device tree blob (dtb) must be placed on an 8-byte boundary within
-the first 512 megabytes from the start of the kernel image and must not
-cross a 2-megabyte boundary. This is to allow the kernel to map the
-blob using a single section mapping in the initial page tables.
+The device tree blob (dtb) must be placed on an 8-byte boundary and must
+not exceed 2 megabytes in size.
 
+NOTE: versions prior to v4.2 also require that the dtb be placed within
+512 MB of the 2 MB aligned boundary right below the kernel Image.
 
 3. Decompress the kernel image
 ------------------------------
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 926495686554..fb3557c6901f 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -32,6 +32,15 @@ 
  */
 enum fixed_addresses {
 	FIX_HOLE,
+
+	/*
+	 * Reserve 4 MB of virtual space for the FDT at the top of the fixmap
+	 * region. Keep this at the top so it remains 2 MB aligned.
+	 */
+#define FIX_FDT_SIZE		SZ_4M
+	FIX_FDT_END,
+	FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
+
 	FIX_EARLYCON_MEM_BASE,
 	FIX_TEXT_POKE0,
 	__end_of_permanent_fixed_addresses,
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 3d311761e3c2..79fcfb048884 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -34,5 +34,6 @@  extern void init_mem_pgprot(void);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot);
+extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
 
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index d5e70747c7a2..772831cf6937 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -6,6 +6,7 @@  CPPFLAGS_vmlinux.lds	:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_efi-stub.o 	:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_armv8_deprecated.o := -I$(src)
+CFLAGS_setup.o		:= -I$(srctree)/scripts/dtc/libfdt/
 
 CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_insn.o = -pg
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index f54125a95a6d..208ca21868cc 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -237,8 +237,6 @@  ENTRY(stext)
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
 	adrp	x24, __PHYS_OFFSET
 	bl	set_cpu_boot_mode_flag
-
-	bl	__vet_fdt
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -270,24 +268,6 @@  preserve_boot_args:
 ENDPROC(preserve_boot_args)
 
 /*
- * Determine validity of the x21 FDT pointer.
- * The dtb must be 8-byte aligned and live in the first 512M of memory.
- */
-__vet_fdt:
-	tst	x21, #0x7
-	b.ne	1f
-	cmp	x21, x24
-	b.lt	1f
-	mov	x0, #(1 << 29)
-	add	x0, x0, x24
-	cmp	x21, x0
-	b.ge	1f
-	ret
-1:
-	mov	x21, #0
-	ret
-ENDPROC(__vet_fdt)
-/*
  * Macro to create a table entry to the next page.
  *
  *	tbl:	page table address
@@ -348,8 +328,7 @@  ENDPROC(__vet_fdt)
  * required to get the kernel running. The following sections are required:
  *   - identity mapping to enable the MMU (low address, TTBR0)
  *   - first few MB of the kernel linear mapping to jump to once the MMU has
- *     been enabled, including the FDT blob (TTBR1)
- *   - pgd entry for fixed mappings (TTBR1)
+ *     been enabled
  */
 __create_page_tables:
 	adrp	x25, idmap_pg_dir
@@ -439,22 +418,6 @@  __create_page_tables:
 	create_block_map x0, x7, x3, x5, x6
 
 	/*
-	 * Map the FDT blob (maximum 2MB; must be within 512MB of
-	 * PHYS_OFFSET).
-	 */
-	mov	x3, x21				// FDT phys address
-	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
-	mov	x6, #PAGE_OFFSET
-	sub	x5, x3, x24			// subtract PHYS_OFFSET
-	tst	x5, #~((1 << 29) - 1)		// within 512MB?
-	csel	x21, xzr, x21, ne		// zero the FDT pointer
-	b.ne	1f
-	add	x5, x5, x6			// __va(FDT blob)
-	add	x6, x5, #1 << 21		// 2MB for the FDT blob
-	sub	x6, x6, #1			// inclusive range
-	create_block_map x0, x7, x3, x5, x6
-1:
-	/*
 	 * Since the page tables have been populated with non-cacheable
 	 * accesses (MMU disabled), invalidate the idmap and swapper page
 	 * tables again to remove any speculatively loaded cache lines.
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 51ef97274b52..40675229e95c 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -45,6 +45,7 @@ 
 #include <linux/of_platform.h>
 #include <linux/efi.h>
 #include <linux/personality.h>
+#include <linux/libfdt.h>
 
 #include <asm/fixmap.h>
 #include <asm/cpu.h>
@@ -103,18 +104,6 @@  static struct resource mem_res[] = {
 #define kernel_code mem_res[0]
 #define kernel_data mem_res[1]
 
-void __init early_print(const char *str, ...)
-{
-	char buf[256];
-	va_list ap;
-
-	va_start(ap, str);
-	vsnprintf(buf, sizeof(buf), str, ap);
-	va_end(ap);
-
-	printk("%s", buf);
-}
-
 /*
  * The recorded values of x0 .. x3 upon kernel entry.
  */
@@ -324,17 +313,31 @@  static void __init setup_processor(void)
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
 {
-	if (!dt_phys || !early_init_dt_scan(phys_to_virt(dt_phys))) {
-		early_print("\n"
-			"Error: invalid device tree blob at physical address 0x%p (virtual address 0x%p)\n"
-			"The dtb must be 8-byte aligned and passed in the first 512MB of memory\n"
-			"\nPlease check your bootloader.\n",
-			dt_phys, phys_to_virt(dt_phys));
+	void *dt_virt = NULL;
+
+	if (dt_phys && (dt_phys & 7) == 0)
+		dt_virt = fixmap_remap_fdt(dt_phys);
+
+	/*
+	 * Before passing the dt_virt pointer to early_init_dt_scan(), we have
+	 * to ensure that the FDT size as reported in the FDT itself does not
+	 * exceed the window we just mapped for it.
+	 */
+	if (!dt_virt ||
+	    fdt_check_header(dt_virt) != 0 ||
+	    (dt_phys & (SZ_2M - 1)) + fdt_totalsize(dt_virt) > FIX_FDT_SIZE ||
+	    !early_init_dt_scan(dt_virt)) {
+		pr_crit("\n"
+			"Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
+			"The dtb must be 8-byte aligned and must not exceed 2 MB in size\n"
+			"\nPlease check your bootloader.",
+			&dt_phys, dt_virt);
 
 		while (true)
 			cpu_relax();
 	}
 
+	memblock_reserve(dt_phys, fdt_totalsize(dt_virt));
 	dump_stack_set_arch_desc("%s (DT)", of_flat_dt_get_machine_name());
 }
 
@@ -372,6 +375,9 @@  void __init setup_arch(char **cmdline_p)
 {
 	setup_processor();
 
+	early_fixmap_init();
+	early_ioremap_init();
+
 	setup_machine_fdt(__fdt_pointer);
 
 	init_mm.start_code = (unsigned long) _text;
@@ -381,9 +387,6 @@  void __init setup_arch(char **cmdline_p)
 
 	*cmdline_p = boot_command_line;
 
-	early_fixmap_init();
-	early_ioremap_init();
-
 	parse_early_param();
 
 	/*
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index fa2389b0f7f0..ae85da6307bb 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -170,7 +170,6 @@  void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
-	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 428aaf86c95b..60be58a160a2 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -643,3 +643,51 @@  void __set_fixmap(enum fixed_addresses idx,
 		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
 	}
 }
+
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
+	phys_addr_t dt_phys_base = dt_phys & ~(SZ_2M - 1);
+	u64 pfn = dt_phys_base >> PAGE_SHIFT;
+	u64 pfn_end = pfn + (FIX_FDT_SIZE >> PAGE_SHIFT);
+	pgprot_t prot = PAGE_KERNEL | PTE_RDONLY;
+
+	/*
+	 * Make sure that the FDT region can be mapped without the need to
+	 * allocate additional translation table pages, or perform physical
+	 * to virtual address translations via the linear mapping.
+	 *
+	 * On 4k pages, we'll use section mappings for the region so we
+	 * only have to be in the same PUD as the rest of the fixmap.
+	 * On 64k pages, we need to be in the same PMD as well, as the region
+	 * will be mapped using PTEs.
+	 */
+	BUILD_BUG_ON(dt_virt_base & (SZ_2M - 1));
+
+	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) {
+		pte_t *pte;
+
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PMD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT);
+
+		/* map the FDT using 64k pages */
+		pte = fixmap_pte(dt_virt_base);
+		do {
+			set_pte(pte++, pfn_pte(pfn++, prot));
+		} while (pfn < pfn_end);
+	} else {
+		pmd_t *pmd;
+
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PUD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PUD_SHIFT);
+
+		/* map the FDT using 2 MB sections */
+		pmd = fixmap_pmd(dt_virt_base);
+		do {
+			set_pmd(pmd++, pfn_pmd(pfn, mk_sect_prot(prot)));
+			pfn += PMD_SIZE >> PAGE_SHIFT;
+		} while (pfn < pfn_end);
+	}
+
+	return (void *)(dt_virt_base + dt_phys - dt_phys_base);
+}