[v3,00/13] Introduce sv48 support without relocatable kernel

Message ID 20211206104657.433304-1-alexandre.ghiti@canonical.com

Message

Alexandre Ghiti Dec. 6, 2021, 10:46 a.m. UTC
* Please note the notable changes in memory layouts and KASAN population *

This patchset makes it possible to use a single kernel for both sv39 and
sv48 without that kernel being relocatable.

The idea comes from Arnd Bergmann, who suggested doing the same as x86,
that is, mapping the kernel at the end of the address space. This allows
the kernel to be linked at the same address for both sv39 and sv48, so it
does not need to be relocated at runtime.
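
To illustrate why an address at the very end of the address space works
for both modes, here is a minimal sketch (not taken from the patches). It
assumes a link address in the last 2 GiB of the 64-bit space;
EXAMPLE_KERNEL_LINK_ADDR and is_canonical() are hypothetical names used
only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: link in the last 2 GiB of the address space. */
#define EXAMPLE_KERNEL_LINK_ADDR 0xffffffff80000000ULL

/*
 * A virtual address is canonical for a given mode when bits
 * 63..(va_bits - 1) are all equal, i.e. the address is the sign
 * extension of its low va_bits bits.
 */
static bool is_canonical(uint64_t va, unsigned int va_bits)
{
	uint64_t upper = va >> (va_bits - 1);
	uint64_t all_ones = (1ULL << (64 - va_bits + 1)) - 1;

	return upper == 0 || upper == all_ones;
}
```

With these definitions, is_canonical(EXAMPLE_KERNEL_LINK_ADDR, 39) and
is_canonical(EXAMPLE_KERNEL_LINK_ADDR, 48) both hold, whereas the start of
the sv48 upper half (0xffff800000000000) is not a valid sv39 address.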

This implements sv48 support at runtime. The kernel will try to boot with
a 4-level page table and will fall back to a 3-level one if the HW does not
support it. Folding the 4th level into a 3-level page table has almost no
cost at runtime.
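
The fallback can be pictured with the following user-space sketch, which
is a simulation under stated assumptions and not the code from the
patches. It relies on the privileged-spec behaviour that a write to satp
with an unsupported MODE has no effect; struct fake_hart, the fake_csr_*
helpers and probe_va_bits() are hypothetical names standing in for real
CSR accesses.

```c
#include <stdbool.h>
#include <stdint.h>

#define SATP_MODE_SHIFT 60
#define SATP_MODE_SV39  8ULL
#define SATP_MODE_SV48  9ULL

/* Hypothetical stand-in for a hart: only models the satp register. */
struct fake_hart {
	uint64_t satp;
	bool has_sv48;
};

/* Models the rule that a write with an unsupported MODE is ignored. */
static void fake_csr_write_satp(struct fake_hart *h, uint64_t val)
{
	uint64_t mode = val >> SATP_MODE_SHIFT;

	if (mode == SATP_MODE_SV48 && !h->has_sv48)
		return;
	h->satp = val;
}

static uint64_t fake_csr_read_satp(const struct fake_hart *h)
{
	return h->satp;
}

/* Try sv48 first; if the write did not stick, report sv39. */
static unsigned int probe_va_bits(struct fake_hart *h, uint64_t root_ppn)
{
	uint64_t want = (SATP_MODE_SV48 << SATP_MODE_SHIFT) | root_ppn;
	bool l4_enabled;

	fake_csr_write_satp(h, want);
	l4_enabled = fake_csr_read_satp(h) == want;
	fake_csr_write_satp(h, 0);	/* back to bare mode after probing */

	return l4_enabled ? 48 : 39;
}
```

A hart modelled with has_sv48 set reports 48 bits, one without reports 39;
in both cases satp is left in bare mode after the probe.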

Note that the kasan region had to be moved to the end of the address space
since its location must be known at compile time and must therefore be valid
for both sv39 and sv48 (and sv57, which is coming).
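
As background on the compile-time constraint, generic KASAN maps every 8
bytes of memory to one shadow byte through a fixed offset. The sketch
below shows that mapping only; EXAMPLE_SHADOW_OFFSET is a hypothetical
value chosen for illustration and mem_to_shadow() is a simplified
stand-in, neither is quoted from this series.

```c
#include <stdint.h>

/* One shadow byte tracks 2^3 = 8 bytes of memory in generic KASAN. */
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical constant: must be fixed at build time, whatever the mode. */
#define EXAMPLE_SHADOW_OFFSET 0xdfffffff00000000ULL

/* Simplified address-to-shadow translation (unsigned wraparound intended). */
static inline uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + EXAMPLE_SHADOW_OFFSET;
}
```

Because the offset is a constant baked into the instrumented code, the
shadow of a given address cannot move depending on whether the kernel
ends up running in sv39 or sv48.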

Tested on:
  - qemu rv64 sv39: OK
  - qemu rv64 sv48: OK
  - qemu rv64 sv39 + kasan: OK
  - qemu rv64 sv48 + kasan: OK
  - qemu rv32: OK

Changes in v3:
  - Fix SZ_1T, thanks to Atish
  - Fix warning create_pud_mapping, thanks to Atish
  - Fix k210 nommu build, thanks to Atish
  - Fix wrong rebase as noted by Samuel
  - * Downgrade to sv39 is only possible if !KASAN (see commit changelog) *
  - * Move KASAN next to the kernel: virtual layouts changed and kasan population *

Changes in v2:
  - Rebase onto for-next
  - Fix KASAN
  - Fix stack canary
  - Get rid of MAXPHYSMEM configs completely
  - Add documentation

Alexandre Ghiti (13):
  riscv: Move KASAN mapping next to the kernel mapping
  riscv: Split early kasan mapping to prepare sv48 introduction
  riscv: Introduce functions to switch pt_ops
  riscv: Allow to dynamically define VA_BITS
  riscv: Get rid of MAXPHYSMEM configs
  asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
  riscv: Implement sv48 support
  riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
  riscv: Explicit comment about user virtual address space size
  riscv: Improve virtual kernel memory layout dump
  Documentation: riscv: Add sv48 description to VM layout
  riscv: Initialize thread pointer before calling C functions
  riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN

 Documentation/riscv/vm-layout.rst             |  48 ++-
 arch/riscv/Kconfig                            |  37 +-
 arch/riscv/configs/nommu_k210_defconfig       |   1 -
 .../riscv/configs/nommu_k210_sdcard_defconfig |   1 -
 arch/riscv/configs/nommu_virt_defconfig       |   1 -
 arch/riscv/include/asm/csr.h                  |   3 +-
 arch/riscv/include/asm/fixmap.h               |   1
 arch/riscv/include/asm/kasan.h                |  11 +-
 arch/riscv/include/asm/page.h                 |  20 +-
 arch/riscv/include/asm/pgalloc.h              |  40 ++
 arch/riscv/include/asm/pgtable-64.h           | 108 ++++-
 arch/riscv/include/asm/pgtable.h              |  47 +-
 arch/riscv/include/asm/sparsemem.h            |   6 +-
 arch/riscv/kernel/cpu.c                       |  23 +-
 arch/riscv/kernel/head.S                      |   4 +-
 arch/riscv/mm/context.c                       |   4 +-
 arch/riscv/mm/init.c                          | 408 ++++++++++++++----
 arch/riscv/mm/kasan_init.c                    | 250 ++++++++---
 drivers/firmware/efi/libstub/efi-stub.c       |   2
 drivers/pci/controller/pci-xgene.c            |   2 +-
 include/asm-generic/pgalloc.h                 |  24 +-
 include/linux/sizes.h                         |   1
 22 files changed, 833 insertions(+), 209 deletions(-)

--
2.32.0

Comments

Guo Ren Dec. 20, 2021, 9:11 a.m. UTC | #1
On Tue, Dec 7, 2021 at 11:55 AM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> Because of the stack canary feature that reads from the current task
> structure the stack canary value, the thread pointer register "tp" must
> be set before calling any C function from head.S: by chance, setup_vm

Shall we disable -fstack-protector for setup_vm() with __attribute__?
Actually, we already initialize tp later.

> and all the functions that it calls does not seem to be part of the
> functions where the canary check is done, but in the following commits,
> some functions will.
>
> Fixes: f2c9699f65557a31 ("riscv: Add STACKPROTECTOR supported")
> Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
> ---
>  arch/riscv/kernel/head.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index c3c0ed559770..86f7ee3d210d 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -302,6 +302,7 @@ clear_bss_done:
>         REG_S a0, (a2)
>
>         /* Initialize page tables and relocate to virtual addresses */
> +       la tp, init_task
>         la sp, init_thread_union + THREAD_SIZE
>         XIP_FIXUP_OFFSET sp
>  #ifdef CONFIG_BUILTIN_DTB
> --
> 2.32.0
>
Ard Biesheuvel Dec. 20, 2021, 9:17 a.m. UTC | #2
On Mon, 20 Dec 2021 at 10:11, Guo Ren <guoren@kernel.org> wrote:
>
> On Tue, Dec 7, 2021 at 11:55 AM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > Because of the stack canary feature that reads from the current task
> > structure the stack canary value, the thread pointer register "tp" must
> > be set before calling any C function from head.S: by chance, setup_vm
> Shall we disable -fstack-protector for setup_vm() with __attribute__?

Don't use __attribute__((optimize())) for that: it is known to be
broken, and documented as debug purposes only in the GCC info pages:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
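
For reference, a dedicated per-function attribute exists that does not go
through optimize(): no_stack_protector, available in GCC 11 and later and
in Clang. The sketch below only shows its syntax on a hypothetical helper
(early_strlen() is not a kernel function); whether it would be suitable
for setup_vm() is not established in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical early-boot helper. The attribute asks the compiler not
 * to emit a stack canary check for this one function; functions it
 * calls are not affected.
 */
__attribute__((no_stack_protector))
static size_t early_strlen(const char *s)
{
	size_t n = 0;

	while (s[n] != '\0')
		n++;
	return n;
}
```

Note that the attribute covers only the annotated function, so every
callee reached before tp is set would need the same treatment.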




Guo Ren Dec. 20, 2021, 1:40 p.m. UTC | #3
On Mon, Dec 20, 2021 at 5:17 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 20 Dec 2021 at 10:11, Guo Ren <guoren@kernel.org> wrote:
> >
> > On Tue, Dec 7, 2021 at 11:55 AM Alexandre Ghiti
> > <alexandre.ghiti@canonical.com> wrote:
> > >
> > > Because of the stack canary feature that reads from the current task
> > > structure the stack canary value, the thread pointer register "tp" must
> > > be set before calling any C function from head.S: by chance, setup_vm
> > Shall we disable -fstack-protector for setup_vm() with __attribute__?
>
> Don't use __attribute__((optimize())) for that: it is known to be
> broken, and documented as debug purposes only in the GCC info pages:
>
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html

Oh, thanks for the link.

Palmer Dabbelt Jan. 20, 2022, 4:18 a.m. UTC | #4
Sorry this took a while.  This is on for-next, with a bit of juggling: a
handful of trivial fixes for configs that were failing to build/boot, plus
some merge issues.  I also pulled that MAXPHYSMEM fix out to the top, so
it'd be easier to backport.  This is bigger than something I'd normally like
to take late in the cycle, but given there are a lot of cleanups, likely some
fixes, and it looks like folks have been testing this, I'm just going to go
with it.

Let me know if there are any issues with the merge; it was a bit hairy.
Probably best to just send along a fixup patch at this point.

Thanks!
Alexandre Ghiti Jan. 20, 2022, 7:30 a.m. UTC | #5
On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> Sorry this took a while.  This is on for-next, with a bit of juggling: a
> handful of trivial fixes for configs that were failing to build/boot and
> some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
> it'd be easier to backport.  This is bigger than something I'd normally like to
> take late in the cycle, but given there's a lot of cleanups, likely some fixes,
> and it looks like folks have been testing this I'm just going to go with it.
>

Yes yes yes! That's fantastic news :)

> Let me know if there's any issues with the merge, it was a bit hairy.
> Probably best to just send along a fixup patch at this point.

I'm going to take a look at that now, and I'll quickly fix anything that
comes up :)

Thanks!

Alex

>
> Thanks!
Alexandre Ghiti Jan. 20, 2022, 10:05 a.m. UTC | #6
On Thu, Jan 20, 2022 at 8:30 AM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> >
> > Sorry this took a while.  This is on for-next, with a bit of juggling: a
> > handful of trivial fixes for configs that were failing to build/boot and
> > some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
> > it'd be easier to backport.  This is bigger than something I'd normally like to
> > take late in the cycle, but given there's a lot of cleanups, likely some fixes,
> > and it looks like folks have been testing this I'm just going to go with it.
> >
>
> Yes yes yes! That's fantastic news :)
>
> > Let me know if there's any issues with the merge, it was a bit hairy.
> > Probably best to just send along a fixup patch at this point.
>
> I'm going to take a look at that now, and I'll fix anything that comes
> up quickly :)

I see in for-next that you did not take the following patches:

  riscv: Improve virtual kernel memory layout dump
  Documentation: riscv: Add sv48 description to VM layout
  riscv: Initialize thread pointer before calling C functions
  riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN

I'm not sure this was your intention. If it was, I believe that at
least the first 2 patches are needed in this series, the 3rd one is a
useful fix and we can discuss the 4th if that's an issue for you.

I tested for-next on both sv39 and sv48 successfully. I took a glance
at the code and noticed you fixed the PTRS_PER_PGD error, thanks for
that. Otherwise nothing obvious has popped up.

Thanks again,

Alex

>
> Thanks!
>
> Alex
>
> >
> > Thanks!
Alexandre Ghiti Feb. 18, 2022, 10:45 a.m. UTC | #7
Hi Palmer,

On Thu, Jan 20, 2022 at 11:05 AM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> On Thu, Jan 20, 2022 at 8:30 AM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> > >
> > > Sorry this took a while.  This is on for-next, with a bit of juggling: a
> > > handful of trivial fixes for configs that were failing to build/boot and
> > > some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
> > > it'd be easier to backport.  This is bigger than something I'd normally like to
> > > take late in the cycle, but given there's a lot of cleanups, likely some fixes,
> > > and it looks like folks have been testing this I'm just going to go with it.
> > >
> >
> > Yes yes yes! That's fantastic news :)
> >
> > > Let me know if there's any issues with the merge, it was a bit hairy.
> > > Probably best to just send along a fixup patch at this point.
> >
> > I'm going to take a look at that now, and I'll fix anything that comes
> > up quickly :)
>
> I see in for-next that you did not take the following patches:
>
>   riscv: Improve virtual kernel memory layout dump
>   Documentation: riscv: Add sv48 description to VM layout
>   riscv: Initialize thread pointer before calling C functions
>   riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN
>
> I'm not sure this was your intention. If it was, I believe that at
> least the first 2 patches are needed in this series, the 3rd one is a
> useful fix and we can discuss the 4th if that's an issue for you.

Can you confirm that this was intentional and maybe explain the
motivation behind it? I ask because I see value in those patches.

Thanks,

Alex

>
> I tested for-next on both sv39 and sv48 successfully, I took a glance
> at the code and noticed you fixed the PTRS_PER_PGD error, thanks for
> that. Otherwise nothing obvious has popped.
>
> Thanks again,
>
> Alex
>
> >
> > Thanks!
> >
> > Alex
> >
> > >
> > > Thanks!
Alexandre Ghiti April 1, 2022, 12:56 p.m. UTC | #8
On Fri, Feb 18, 2022 at 11:45 AM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> Hi Palmer,
>
> On Thu, Jan 20, 2022 at 11:05 AM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > On Thu, Jan 20, 2022 at 8:30 AM Alexandre Ghiti
> > <alexandre.ghiti@canonical.com> wrote:
> > >
> > > On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> > > >
> > > > Sorry this took a while.  This is on for-next, with a bit of juggling: a
> > > > handful of trivial fixes for configs that were failing to build/boot and
> > > > some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
> > > > it'd be easier to backport.  This is bigger than something I'd normally like to
> > > > take late in the cycle, but given there's a lot of cleanups, likely some fixes,
> > > > and it looks like folks have been testing this I'm just going to go with it.
> > > >
> > >
> > > Yes yes yes! That's fantastic news :)
> > >
> > > > Let me know if there's any issues with the merge, it was a bit hairy.
> > > > Probably best to just send along a fixup patch at this point.
> > >
> > > I'm going to take a look at that now, and I'll fix anything that comes
> > > up quickly :)
> >
> > I see in for-next that you did not take the following patches:
> >
> >   riscv: Improve virtual kernel memory layout dump
> >   Documentation: riscv: Add sv48 description to VM layout
> >   riscv: Initialize thread pointer before calling C functions
> >   riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN
> >
> > I'm not sure this was your intention. If it was, I believe that at
> > least the first 2 patches are needed in this series, the 3rd one is a
> > useful fix and we can discuss the 4th if that's an issue for you.
>
> Can you confirm that this was intentional and maybe explain the
> motivation behind it? Because I see value in those patches.

Palmer,

I read that you were still taking patches for 5.18, so I confirm again
that the patches above are needed IMO.

Maybe even the relocatable series?

Thanks,

Alex

>
> Thanks,
>
> Alex
>
> >
> > I tested for-next on both sv39 and sv48 successfully, I took a glance
> > at the code and noticed you fixed the PTRS_PER_PGD error, thanks for
> > that. Otherwise nothing obvious has popped.
> >
> > Thanks again,
> >
> > Alex
> >
> > >
> > > Thanks!
> > >
> > > Alex
> > >
> > > >
> > > > Thanks!
Palmer Dabbelt April 23, 2022, 1:50 a.m. UTC | #9
On Fri, 01 Apr 2022 05:56:30 PDT (-0700), alexandre.ghiti@canonical.com wrote:
> On Fri, Feb 18, 2022 at 11:45 AM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
>>
>> Hi Palmer,
>>
>> On Thu, Jan 20, 2022 at 11:05 AM Alexandre Ghiti
>> <alexandre.ghiti@canonical.com> wrote:
>> >
>> > On Thu, Jan 20, 2022 at 8:30 AM Alexandre Ghiti
>> > <alexandre.ghiti@canonical.com> wrote:
>> > >
>> > > On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>> > > >
>> > > > Sorry this took a while.  This is on for-next, with a bit of juggling: a
>> > > > handful of trivial fixes for configs that were failing to build/boot and
>> > > > some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
>> > > > it'd be easier to backport.  This is bigger than something I'd normally like to
>> > > > take late in the cycle, but given there's a lot of cleanups, likely some fixes,
>> > > > and it looks like folks have been testing this I'm just going to go with it.
>> > > >
>> > >
>> > > Yes yes yes! That's fantastic news :)
>> > >
>> > > > Let me know if there's any issues with the merge, it was a bit hairy.
>> > > > Probably best to just send along a fixup patch at this point.
>> > >
>> > > I'm going to take a look at that now, and I'll fix anything that comes
>> > > up quickly :)
>> >
>> > I see in for-next that you did not take the following patches:
>> >
>> >   riscv: Improve virtual kernel memory layout dump
>> >   Documentation: riscv: Add sv48 description to VM layout
>> >   riscv: Initialize thread pointer before calling C functions
>> >   riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN
>> >
>> > I'm not sure this was your intention. If it was, I believe that at
>> > least the first 2 patches are needed in this series, the 3rd one is a
>> > useful fix and we can discuss the 4th if that's an issue for you.
>>
>> Can you confirm that this was intentional and maybe explain the
>> motivation behind it? Because I see value in those patches.
>
> Palmer,
>
> I read that you were still taking patches for 5.18, so I confirm again
> that the patches above are needed IMO.

It was too late for this when it was sent (I saw it then, but just got 
around to actually doing the work to sort it out).

It took me a while to figure out exactly what was going on here, but I 
think I remember now: that downgrade patch (and the follow-on I just 
sent) is broken for medlow, because mm/init.c must be built medany 
(which we're using for the mostly-PIC qualities).  I remember being in 
the middle of rebasing/debugging this a while ago, I must have forgotten 
I was in the middle of that and accidentally merged the branch as-is.  
Certainly wasn't trying to silently take half the patch set and leave 
the rest in limbo, that's the wrong way to do things. 

I'm not sure what the right answer is here, but I just sent a patch to
drop support for medlow.  We'll have to talk about that; for now I
cleaned up some other minor issues, rearranged the docs and fix to come
first, and put this at palmer/riscv-sv48.  I think it's reasonable to
take the doc and fix into fixes, then the dump improvement on for-next.
We'll have to see what folks think about the medany-only kernels; the
other option would be to build FDT as medany, which seems a bit awkward.

> Maybe even the relocatable series?

Do you mind giving me a pointer?  I'm not sure why I'm so drop-prone 
with your patches, I promise I'm not doing it on purpose.

>
> Thanks,
>
> Alex
>
>>
>> Thanks,
>>
>> Alex
>>
>> >
>> > I tested for-next on both sv39 and sv48 successfully, I took a glance
>> > at the code and noticed you fixed the PTRS_PER_PGD error, thanks for
>> > that. Otherwise nothing obvious has popped.
>> >
>> > Thanks again,
>> >
>> > Alex
>> >
>> > >
>> > > Thanks!
>> > >
>> > > Alex
>> > >
>> > > >
>> > > > Thanks!
Palmer Dabbelt June 2, 2022, 3:43 a.m. UTC | #10
On Fri, 22 Apr 2022 18:50:47 PDT (-0700), Palmer Dabbelt wrote:
> On Fri, 01 Apr 2022 05:56:30 PDT (-0700), alexandre.ghiti@canonical.com wrote:
>> On Fri, Feb 18, 2022 at 11:45 AM Alexandre Ghiti
>> <alexandre.ghiti@canonical.com> wrote:
>>>
>>> Hi Palmer,
>>>
>>> On Thu, Jan 20, 2022 at 11:05 AM Alexandre Ghiti
>>> <alexandre.ghiti@canonical.com> wrote:
>>> >
>>> > On Thu, Jan 20, 2022 at 8:30 AM Alexandre Ghiti
>>> > <alexandre.ghiti@canonical.com> wrote:
>>> > >
>>> > > On Thu, Jan 20, 2022 at 5:18 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>> > > >
>>> > > > Sorry this took a while.  This is on for-next, with a bit of juggling: a
>>> > > > handful of trivial fixes for configs that were failing to build/boot and
>>> > > > some merge issues.  I also pulled out that MAXPHYSMEM fix to the top, so
>>> > > > it'd be easier to backport.  This is bigger than something I'd normally like to
>>> > > > take late in the cycle, but given there's a lot of cleanups, likely some fixes,
>>> > > > and it looks like folks have been testing this I'm just going to go with it.
>>> > > >
>>> > >
>>> > > Yes yes yes! That's fantastic news :)
>>> > >
>>> > > > Let me know if there's any issues with the merge, it was a bit hairy.
>>> > > > Probably best to just send along a fixup patch at this point.
>>> > >
>>> > > I'm going to take a look at that now, and I'll fix anything that comes
>>> > > up quickly :)
>>> >
>>> > I see in for-next that you did not take the following patches:
>>> >
>>> >   riscv: Improve virtual kernel memory layout dump
>>> >   Documentation: riscv: Add sv48 description to VM layout
>>> >   riscv: Initialize thread pointer before calling C functions
>>> >   riscv: Allow user to downgrade to sv39 when hw supports sv48 if !KASAN
>>> >
>>> > I'm not sure this was your intention. If it was, I believe that at
>>> > least the first 2 patches are needed in this series, the 3rd one is a
>>> > useful fix and we can discuss the 4th if that's an issue for you.
>>>
>>> Can you confirm that this was intentional and maybe explain the
>>> motivation behind it? Because I see value in those patches.
>>
>> Palmer,
>>
>> I read that you were still taking patches for 5.18, so I confirm again
>> that the patches above are needed IMO.
>
> It was too late for this when it was sent (I saw it then, but just got
> around to actually doing the work to sort it out).
>
> It took me a while to figure out exactly what was going on here, but I
> think I remember now: that downgrade patch (and the follow-on I just
> sent) is broken for medlow, because mm/init.c must be built medany
> (which we're using for the mostly-PIC qualities).  I remember being in
> the middle of rebasing/debugging this a while ago, I must have forgotten
> I was in the middle of that and accidentally merged the branch as-is.
> Certainly wasn't trying to silently take half the patch set and leave
> the rest in limbo, that's the wrong way to do things.
>
> I'm not sure what the right answer is here, but I just sent a patch to
> drop support for medlow.  We'll have to talk about that; for now I
> cleaned up some other minor issues, rearranged the docs and fix to come
> first, and put this at palmer/riscv-sv48.  I think it's reasonable to
> take the doc and fix into fixes, then the dump improvement on for-next.
> We'll have to see what folks think about the medany-only kernels; the
> other option would be to build FDT as medany, which seems a bit awkward.

All but the last one are on for-next; the discussion on that last one
pointed out some better ways to do it.

>
>> Maybe even the relocatable series?
>
> Do you mind giving me a pointer?  I'm not sure why I'm so drop-prone
> with your patches, I promise I'm not doing it on purpose.
>
>>
>> Thanks,
>>
>> Alex
>>
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>> >
>>> > I tested for-next on both sv39 and sv48 successfully, I took a glance
>>> > at the code and noticed you fixed the PTRS_PER_PGD error, thanks for
>>> > that. Otherwise nothing obvious has popped.
>>> >
>>> > Thanks again,
>>> >
>>> > Alex
>>> >
>>> > >
>>> > > Thanks!
>>> > >
>>> > > Alex
>>> > >
>>> > > >
>>> > > > Thanks!