diff mbox

[v3,2/5] arm64: use fixmap region for permanent FDT mapping

Message ID 1426698308-726-3-git-send-email-ard.biesheuvel@linaro.org
State New
Headers show

Commit Message

Ard Biesheuvel March 18, 2015, 5:05 p.m. UTC
Currently, the FDT blob needs to be in the same naturally aligned
512 MB region as the kernel, so that it can be mapped into the
kernel virtual memory space very early on using a minimal set of
statically allocated translation tables.

Now that we have early fixmap support, we can relax this restriction,
by moving the permanent FDT mapping to the fixmap region instead.
This way, the FDT blob may be anywhere in memory.

This also moves the vetting of the FDT to setup.c, since the early
init code in head.S does not handle mapping of the FDT anymore.
At the same time, fix up some comments in head.S that have gone stale.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt | 10 +++---
 arch/arm64/include/asm/fixmap.h |  9 ++++++
 arch/arm64/kernel/Makefile      |  1 +
 arch/arm64/kernel/head.S        | 38 +---------------------
 arch/arm64/kernel/setup.c       | 72 +++++++++++++++++++++++++++++------------
 arch/arm64/mm/init.c            |  1 -
 6 files changed, 69 insertions(+), 62 deletions(-)

Comments

Ard Biesheuvel April 9, 2015, 12:12 p.m. UTC | #1
On 9 April 2015 at 13:49, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Ard,
>
> This looks good to me (with some minor comments below).
>
> As a heads-up, this clashes with other changes in the arm64
> for-next/core branch due to some unrelated changes in head.S and
> setup.c.
>
>> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
>> index f3c05b5f9f08..ab5a90adece3 100644
>> --- a/Documentation/arm64/booting.txt
>> +++ b/Documentation/arm64/booting.txt
>> @@ -45,11 +45,13 @@ sees fit.)
>>
>>  Requirement: MANDATORY
>>
>> -The device tree blob (dtb) must be placed on an 8-byte boundary within
>> -the first 512 megabytes from the start of the kernel image and must not
>> -cross a 2-megabyte boundary. This is to allow the kernel to map the
>> -blob using a single section mapping in the initial page tables.
>> +The device tree blob (dtb) must be placed on an 8-byte boundary and must
>> +not cross a 2-megabyte boundary. This is to allow the kernel to map the
>> +blob using a single section mapping in the fixmap region.
>
> Please drop the mention of the fixmap. If we ever move this out of the
> fixmap I don't want to have to update booting.txt again, and it
> shouldn't matter to bootloader authors either way.
>

OK.

>> +NOTE: versions prior to v4.1 require, in addition to the requirements
>> +listed above, that the dtb be placed above the kernel Image inside the
>> +same naturally aligned 512 MB region.
>
> Minor nit: This would read a little better if we just said "versions
> prior to v4.1 also require that ...", dropping "in addition to the
> requirements listed above".
>

OK

Actually, I realized that this is incorrect anyway: it is not the same
naturally aligned 512 MB region, it is actually 'within 512 MB of the
start of the kernel Image' since that itself is naturally aligned to
512 MB in the virtual address space.

>>  enum fixed_addresses {
>>       FIX_HOLE,
>> +
>> +     /*
>> +      * Reserve 2 MB of virtual space for the FDT at the top of the fixmap
>> +      * region. Keep this at the top so it remains 2 MB aligned.
>> +      */
>> +#define FIX_FDT_SIZE         SZ_2M
>> +     FIX_FDT_END,
>> +     FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
>> +
>>       FIX_EARLYCON_MEM_BASE,
>>       FIX_TEXT_POKE0,
>>       __end_of_permanent_fixed_addresses,
>
> [...]
>
>> +static void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
>> +{
>> +     const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
>> +     phys_addr_t dt_phys_base = dt_phys & ~(FIX_FDT_SIZE - 1);
>> +
>> +     /*
>> +      * Make sure that the FDT region can be mapped without the need to
>> +      * allocate additional translation table pages, so that it is safe
>> +      * to call create_pgd_mapping() this early.
>> +      * On 4k pages, we'll use section mappings for the region so we
>> +      * only have to be in the same PUD as the rest of the fixmap.
>> +      * On 64k pages, we need to be in the same PMD as well, as the region
>> +      * will be mapped using PTEs.
>> +      */
>> +     BUILD_BUG_ON(dt_virt_base & (SZ_2M - 1));
>
> s/SZ_2M/FIX_FDT_SIZE/
>
> Perhaps we should also have:
>
> BUILD_BUG_ON_NOT_POWER_OF_2(FIX_FDT_SIZE);
>
> ... given it's necessary for the address masking we do here.
>
> [...]
>

I was going to suggest to use SZ_2M for the masking, but map a 4 MB
window. That way, we can relax the requirement from "should not cross
a 2 MB boundary' to 'should not exceed 2 MB in size', which is
arguable easier to adhere to, since the latter is a property of the
DTB itself, whereas the former is a property of whatever your malloc()
gives you. That means the mask would remain SZ_2M, and the size would
become SZ_4M


>> +     /*
>> +      * Before passing the dt_virt pointer to early_init_dt_scan(), we have
>> +      * to ensure that the FDT size as reported in the FDT itself does not
>> +      * exceed the 2 MB window we just mapped for it.
>> +      */
>> +     if (!dt_virt ||
>> +         fdt_check_header(dt_virt) != 0 ||
>> +         (dt_phys & (SZ_2M - 1)) + fdt_totalsize(dt_virt) > SZ_2M ||
>> +         !early_init_dt_scan(dt_virt)) {
>> +             pr_crit("\n"
>> +                     "Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
>> +                     "The dtb must be 8-byte aligned and must not cross a 2 MB alignment boundary\n"
>>                       "\nPlease check your bootloader.\n",
>> -                     dt_phys, phys_to_virt(dt_phys));
>> +                     &dt_phys, dt_virt);
>
> Nit: drop the leading '\n' on the last line. There's no need for it.
>

OK

> I've tested this on my Juno (atop of arm64 for-next/core), and all seems
> well, so with those points addressed:
>
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Tested-by: Mark Rutland <mark.rutland@arm.com>

Thanks, but this code will likely look quite different if we are going
to cater for moving the kernel out of the linear range, so by then I
probably won't be able to retain these tags.

Cheers,
Ard.
Ard Biesheuvel April 9, 2015, 1:16 p.m. UTC | #2
On 9 April 2015 at 15:12, Mark Rutland <mark.rutland@arm.com> wrote:
>> >> +NOTE: versions prior to v4.1 require, in addition to the requirements
>> >> +listed above, that the dtb be placed above the kernel Image inside the
>> >> +same naturally aligned 512 MB region.
>> >
>> > Minor nit: This would read a little better if we just said "versions
>> > prior to v4.1 also require that ...", dropping "in addition to the
>> > requirements listed above".
>> >
>>
>> OK
>>
>> Actually, I realized that this is incorrect anyway: it is not the same
>> naturally aligned 512 MB region, it is actually 'within 512 MB of the
>> start of the kernel Image' since that itself is naturally aligned to
>> 512 MB in the virtual address space.
>
> Surely starting at TEXT_OFFSET bytes below the kernel image? ;)
>

Only if you find it acceptable that people start putting their FDT in
the TEXT_OFFSET region ... :-)

>> >> +static void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
>> >> +{
>> >> +     const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
>> >> +     phys_addr_t dt_phys_base = dt_phys & ~(FIX_FDT_SIZE - 1);
>> >> +
>> >> +     /*
>> >> +      * Make sure that the FDT region can be mapped without the need to
>> >> +      * allocate additional translation table pages, so that it is safe
>> >> +      * to call create_pgd_mapping() this early.
>> >> +      * On 4k pages, we'll use section mappings for the region so we
>> >> +      * only have to be in the same PUD as the rest of the fixmap.
>> >> +      * On 64k pages, we need to be in the same PMD as well, as the region
>> >> +      * will be mapped using PTEs.
>> >> +      */
>> >> +     BUILD_BUG_ON(dt_virt_base & (SZ_2M - 1));
>> >
>> > s/SZ_2M/FIX_FDT_SIZE/
>> >
>> > Perhaps we should also have:
>> >
>> > BUILD_BUG_ON_NOT_POWER_OF_2(FIX_FDT_SIZE);
>> >
>> > ... given it's necessary for the address masking we do here.
>> >
>> > [...]
>> >
>>
>> I was going to suggest to use SZ_2M for the masking, but map a 4 MB
>> window. That way, we can relax the requirement from "should not cross
>> a 2 MB boundary' to 'should not exceed 2 MB in size', which is
>> arguable easier to adhere to, since the latter is a property of the
>> DTB itself, whereas the former is a property of whatever your malloc()
>> gives you. That means the mask would remain SZ_2M, and the size would
>> become SZ_4M
>
> That sounds good so long as it doesn't get in the way of increasing
> FIX_FDT_SIZE in future.
>

No, it shouldn't. And it actually makes it more obvious when we do
increase it later which values should be left alone (SZ_2M) and which
values should be increased (the higher ones)

>> > I've tested this on my Juno (atop of arm64 for-next/core), and all seems
>> > well, so with those points addressed:
>> >
>> > Reviewed-by: Mark Rutland <mark.rutland@arm.com>
>> > Tested-by: Mark Rutland <mark.rutland@arm.com>
>>
>> Thanks, but this code will likely look quite different if we are going
>> to cater for moving the kernel out of the linear range, so by then I
>> probably won't be able to retain these tags.
>
> Ok. I guess you'll merge the two series together?
>

Yes. I am currently looking into a way to make early_fixmap_init() not
rely on __va(), and if that is easily doable, I will propose something
for the FDT which is quite close to the latest submitted version. But
otherwise, it may be a bit more complicated, and the FDT mapping needs
more work.
diff mbox

Patch

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index f3c05b5f9f08..ab5a90adece3 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -45,11 +45,13 @@  sees fit.)
 
 Requirement: MANDATORY
 
-The device tree blob (dtb) must be placed on an 8-byte boundary within
-the first 512 megabytes from the start of the kernel image and must not
-cross a 2-megabyte boundary. This is to allow the kernel to map the
-blob using a single section mapping in the initial page tables.
+The device tree blob (dtb) must be placed on an 8-byte boundary and must
+not cross a 2-megabyte boundary. This is to allow the kernel to map the
+blob using a single section mapping in the fixmap region.
 
+NOTE: versions prior to v4.1 require, in addition to the requirements
+listed above, that the dtb be placed above the kernel Image inside the
+same naturally aligned 512 MB region.
 
 3. Decompress the kernel image
 ------------------------------
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 926495686554..6e34347e80e3 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -32,6 +32,15 @@ 
  */
 enum fixed_addresses {
 	FIX_HOLE,
+
+	/*
+	 * Reserve 2 MB of virtual space for the FDT at the top of the fixmap
+	 * region. Keep this at the top so it remains 2 MB aligned.
+	 */
+#define FIX_FDT_SIZE		SZ_2M
+	FIX_FDT_END,
+	FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
+
 	FIX_EARLYCON_MEM_BASE,
 	FIX_TEXT_POKE0,
 	__end_of_permanent_fixed_addresses,
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 5ee07eee80c2..e60885766936 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -6,6 +6,7 @@  CPPFLAGS_vmlinux.lds	:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_efi-stub.o 	:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_armv8_deprecated.o := -I$(src)
+CFLAGS_setup.o		:= -I$(srctree)/scripts/dtc/libfdt/
 
 CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_insn.o = -pg
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index d17649d39392..c849cfb157b3 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -255,7 +255,6 @@  ENTRY(stext)
 	cbnz	x23, 1f				// invalid processor (x23=0)?
 	b	__error_p
 1:
-	bl	__vet_fdt
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
 	/*
 	 * The following calls CPU specific code in a position independent
@@ -274,24 +273,6 @@  ENTRY(stext)
 ENDPROC(stext)
 
 /*
- * Determine validity of the x21 FDT pointer.
- * The dtb must be 8-byte aligned and live in the first 512M of memory.
- */
-__vet_fdt:
-	tst	x21, #0x7
-	b.ne	1f
-	cmp	x21, x24
-	b.lt	1f
-	mov	x0, #(1 << 29)
-	add	x0, x0, x24
-	cmp	x21, x0
-	b.ge	1f
-	ret
-1:
-	mov	x21, #0
-	ret
-ENDPROC(__vet_fdt)
-/*
  * Macro to create a table entry to the next page.
  *
  *	tbl:	page table address
@@ -352,8 +333,7 @@  ENDPROC(__vet_fdt)
  * required to get the kernel running. The following sections are required:
  *   - identity mapping to enable the MMU (low address, TTBR0)
  *   - first few MB of the kernel linear mapping to jump to once the MMU has
- *     been enabled, including the FDT blob (TTBR1)
- *   - pgd entry for fixed mappings (TTBR1)
+ *     been enabled
  */
 __create_page_tables:
 	pgtbl	x25, x26, x28			// idmap_pg_dir and swapper_pg_dir addresses
@@ -404,22 +384,6 @@  __create_page_tables:
 	create_block_map x0, x7, x3, x5, x6
 
 	/*
-	 * Map the FDT blob (maximum 2MB; must be within 512MB of
-	 * PHYS_OFFSET).
-	 */
-	mov	x3, x21				// FDT phys address
-	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
-	mov	x6, #PAGE_OFFSET
-	sub	x5, x3, x24			// subtract PHYS_OFFSET
-	tst	x5, #~((1 << 29) - 1)		// within 512MB?
-	csel	x21, xzr, x21, ne		// zero the FDT pointer
-	b.ne	1f
-	add	x5, x5, x6			// __va(FDT blob)
-	add	x6, x5, #1 << 21		// 2MB for the FDT blob
-	sub	x6, x6, #1			// inclusive range
-	create_block_map x0, x7, x3, x5, x6
-1:
-	/*
 	 * Since the page tables have been populated with non-cacheable
 	 * accesses (MMU disabled), invalidate the idmap and swapper page
 	 * tables again to remove any speculatively loaded cache lines.
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 14808947bf46..ef21f6e172f8 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -45,6 +45,7 @@ 
 #include <linux/of_platform.h>
 #include <linux/efi.h>
 #include <linux/personality.h>
+#include <linux/libfdt.h>
 
 #include <asm/fixmap.h>
 #include <asm/cpu.h>
@@ -108,18 +109,6 @@  static struct resource mem_res[] = {
 #define kernel_code mem_res[0]
 #define kernel_data mem_res[1]
 
-void __init early_print(const char *str, ...)
-{
-	char buf[256];
-	va_list ap;
-
-	va_start(ap, str);
-	vsnprintf(buf, sizeof(buf), str, ap);
-	va_end(ap);
-
-	printk("%s", buf);
-}
-
 void __init smp_setup_processor_id(void)
 {
 	u64 mpidr = read_cpuid_mpidr() & MPIDR_HWID_BITMASK;
@@ -332,19 +321,62 @@  static void __init setup_processor(void)
 #endif
 }
 
+static void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
+	phys_addr_t dt_phys_base = dt_phys & ~(FIX_FDT_SIZE - 1);
+
+	/*
+	 * Make sure that the FDT region can be mapped without the need to
+	 * allocate additional translation table pages, so that it is safe
+	 * to call create_pgd_mapping() this early.
+	 * On 4k pages, we'll use section mappings for the region so we
+	 * only have to be in the same PUD as the rest of the fixmap.
+	 * On 64k pages, we need to be in the same PMD as well, as the region
+	 * will be mapped using PTEs.
+	 */
+	BUILD_BUG_ON(dt_virt_base & (SZ_2M - 1));
+
+	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PMD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT);
+	else
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PUD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PUD_SHIFT);
+
+	create_pgd_mapping(&init_mm, dt_phys_base, dt_virt_base, FIX_FDT_SIZE,
+			   PAGE_KERNEL | PTE_RDONLY);
+
+	return (void *)(dt_virt_base + dt_phys - dt_phys_base);
+}
+
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
 {
-	if (!dt_phys || !early_init_dt_scan(phys_to_virt(dt_phys))) {
-		early_print("\n"
-			"Error: invalid device tree blob at physical address 0x%p (virtual address 0x%p)\n"
-			"The dtb must be 8-byte aligned and passed in the first 512MB of memory\n"
+	void *dt_virt = NULL;
+
+	if (dt_phys && (dt_phys & 7) == 0)
+		dt_virt = fixmap_remap_fdt(dt_phys);
+
+	/*
+	 * Before passing the dt_virt pointer to early_init_dt_scan(), we have
+	 * to ensure that the FDT size as reported in the FDT itself does not
+	 * exceed the 2 MB window we just mapped for it.
+	 */
+	if (!dt_virt ||
+	    fdt_check_header(dt_virt) != 0 ||
+	    (dt_phys & (SZ_2M - 1)) + fdt_totalsize(dt_virt) > SZ_2M ||
+	    !early_init_dt_scan(dt_virt)) {
+		pr_crit("\n"
+			"Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
+			"The dtb must be 8-byte aligned and must not cross a 2 MB alignment boundary\n"
 			"\nPlease check your bootloader.\n",
-			dt_phys, phys_to_virt(dt_phys));
+			&dt_phys, dt_virt);
 
 		while (true)
 			cpu_relax();
 	}
 
+	memblock_reserve(dt_phys, fdt_totalsize(dt_virt));
 	dump_stack_set_arch_desc("%s (DT)", of_flat_dt_get_machine_name());
 }
 
@@ -382,6 +414,9 @@  void __init setup_arch(char **cmdline_p)
 {
 	setup_processor();
 
+	early_fixmap_init();
+	early_ioremap_init();
+
 	setup_machine_fdt(__fdt_pointer);
 
 	init_mm.start_code = (unsigned long) _text;
@@ -391,9 +426,6 @@  void __init setup_arch(char **cmdline_p)
 
 	*cmdline_p = boot_command_line;
 
-	early_fixmap_init();
-	early_ioremap_init();
-
 	parse_early_param();
 
 	/*
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index fa2389b0f7f0..ae85da6307bb 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -170,7 +170,6 @@  void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
-	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */