[00/12] Make riscv use THP contpte support for arm64

Message ID 20240508191931.46060-1-alexghiti@rivosinc.com

Message

Alexandre Ghiti May 8, 2024, 7:19 p.m. UTC
This allows riscv to support napot (the riscv equivalent of contpte) THPs
by moving the arm64 contpte support into mm. The previous series [1] only
merged the riscv and arm64 implementations of hugetlbfs contpte.

The riscv contpte specification allows for different contpte sizes,
although only 64KB is supported for now. This patchset therefore implements
support for multiple contpte sizes, which introduces a few arch-specific
helpers to determine which sizes are supported. Even though only one size
is supported on riscv, the multi-size implementation is there to show what
it will look like when other sizes are supported, and to make sure it does
not regress arm64.
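
To make the size arithmetic concrete, here is a minimal user-space sketch
of what such helpers could look like. It is not taken from the series: the
names (SUPPORTED_CONTPTE_ORDERS, contpte_order_supported(), and so on) are
hypothetical, and it assumes a 4KB base page, so that a 64KB contpte block
is 16 contiguous ptes (order 4). It also assumes that a range is eligible
only when the virtual address and the physical address are both aligned to
the block size and the range covers at least one whole block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                          /* assumption: 4KB base pages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/*
 * Hypothetical arch-provided bitmask: bit n set means that a block of
 * 2^n contiguous ptes is supported. Only order 4 (16 * 4KB = 64KB) here.
 */
#define SUPPORTED_CONTPTE_ORDERS ((uint64_t)1 << 4)

/* Does the architecture support contpte blocks of 2^order ptes? */
static bool contpte_order_supported(unsigned int order)
{
	assert(order < 64);
	return (SUPPORTED_CONTPTE_ORDERS >> order) & 1;
}

/* Number of ptes in a block of the given order. */
static uint64_t contpte_nr_ptes(unsigned int order)
{
	return (uint64_t)1 << order;
}

/* Size in bytes of a block of the given order. */
static uint64_t contpte_size(unsigned int order)
{
	return PAGE_SIZE << order;
}

/*
 * A range can be mapped with one contpte block of this order only if the
 * order is supported, va and pa are both aligned to the block size, and
 * the range is at least one block long.
 */
static bool contpte_range_eligible(uint64_t va, uint64_t pa, uint64_t len,
				   unsigned int order)
{
	uint64_t size;

	if (!contpte_order_supported(order))
		return false;

	size = contpte_size(order);
	return (va % size == 0) && (pa % size == 0) && (len >= size);
}
```

Supporting another size in this model would only mean setting another bit
in the mask; the callers iterate over the orders and ask the helpers.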

I tested arm64 using the cow kselftest and a kernel build with 4KB base
page size and 64KB contpte. riscv was tested with the same tests on *all*
contpte sizes that fit in the last page table level (support for PMD sizes
is not present here). Both architectures were tested only on qemu.

Alexandre Ghiti (12):
  mm, arm64: Rename ARM64_CONTPTE to THP_CONTPTE
  mm, riscv, arm64: Use common ptep_get() function
  mm, riscv, arm64: Use common set_ptes() function
  mm, riscv, arm64: Use common ptep_get_lockless() function
  mm, riscv, arm64: Use common set_pte() function
  mm, riscv, arm64: Use common pte_clear() function
  mm, riscv, arm64: Use common ptep_get_and_clear() function
  mm, riscv, arm64: Use common ptep_test_and_clear_young() function
  mm, riscv, arm64: Use common ptep_clear_flush_young() function
  mm, riscv, arm64: Use common ptep_set_access_flags() function
  mm, riscv, arm64: Use common ptep_set_wrprotect()/wrprotect_ptes()
    functions
  mm, riscv, arm64: Use common
    get_and_clear_full_ptes()/clear_full_ptes() functions

 arch/arm64/Kconfig               |   9 -
 arch/arm64/include/asm/pgtable.h | 318 +++++---------
 arch/arm64/mm/Makefile           |   1 -
 arch/arm64/mm/contpte.c          | 408 ------------------
 arch/arm64/mm/hugetlbpage.c      |   6 +-
 arch/arm64/mm/mmu.c              |   2 +-
 arch/riscv/include/asm/kfence.h  |   4 +-
 arch/riscv/include/asm/pgtable.h | 206 +++++++++-
 arch/riscv/kernel/efi.c          |   4 +-
 arch/riscv/kernel/hibernate.c    |   2 +-
 arch/riscv/kvm/mmu.c             |  26 +-
 arch/riscv/mm/fault.c            |   2 +-
 arch/riscv/mm/init.c             |   4 +-
 arch/riscv/mm/kasan_init.c       |  16 +-
 arch/riscv/mm/pageattr.c         |   8 +-
 arch/riscv/mm/pgtable.c          |   6 +-
 include/linux/contpte.h          |  37 ++
 mm/Kconfig                       |   9 +
 mm/contpte.c                     | 685 ++++++++++++++++++++++++++++++-
 19 files changed, 1056 insertions(+), 697 deletions(-)
 delete mode 100644 arch/arm64/mm/contpte.c
 create mode 100644 include/linux/contpte.h

Comments

Barry Song May 9, 2024, 12:46 a.m. UTC | #1
On Thu, May 9, 2024 at 7:20 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> The ARM64_CONTPTE config represents the capability to transparently use
> contpte mappings for THP userspace mappings, which will be implemented
> in the next commits for riscv, so make this config more generic and move
> it to mm.
>
> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---
>  arch/arm64/Kconfig               | 9 ---------
>  arch/arm64/include/asm/pgtable.h | 6 +++---
>  arch/arm64/mm/Makefile           | 2 +-
>  mm/Kconfig                       | 9 +++++++++
>  4 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index ac2f6d906cc3..9d823015b4e5 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2227,15 +2227,6 @@ config UNWIND_PATCH_PAC_INTO_SCS
>         select UNWIND_TABLES
>         select DYNAMIC_SCS
>
> -config ARM64_CONTPTE
> -       bool "Contiguous PTE mappings for user memory" if EXPERT
> -       depends on TRANSPARENT_HUGEPAGE
> -       default y
> -       help
> -         When enabled, user mappings are configured using the PTE contiguous
> -         bit, for any mappings that meet the size and alignment requirements.
> -         This reduces TLB pressure and improves performance.
> -
>  endmenu # "Kernel Features"
>
>  menu "Boot options"
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 7c2938cb70b9..1758ce71fae9 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1369,7 +1369,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
>                                     unsigned long addr, pte_t *ptep,
>                                     pte_t old_pte, pte_t new_pte);
>
> -#ifdef CONFIG_ARM64_CONTPTE
> +#ifdef CONFIG_THP_CONTPTE

Is it necessarily THP? Can't it be hugetlb or others? I feel THP_CONTPTE
isn't a good name.

>
>  /*
>   * The contpte APIs are used to transparently manage the contiguous bit in ptes
> @@ -1622,7 +1622,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
>         return contpte_ptep_set_access_flags(vma, addr, ptep, entry, dirty);
>  }
>
> -#else /* CONFIG_ARM64_CONTPTE */
> +#else /* CONFIG_THP_CONTPTE */
>
>  #define ptep_get                               __ptep_get
>  #define set_pte                                        __set_pte
> @@ -1642,7 +1642,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
>  #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
>  #define ptep_set_access_flags                  __ptep_set_access_flags
>
> -#endif /* CONFIG_ARM64_CONTPTE */
> +#endif /* CONFIG_THP_CONTPTE */
>
>  int find_num_contig(struct mm_struct *mm, unsigned long addr,
>                     pte_t *ptep, size_t *pgsize);
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 60454256945b..52a1b2082627 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -3,7 +3,7 @@ obj-y                           := dma-mapping.o extable.o fault.o init.o \
>                                    cache.o copypage.o flush.o \
>                                    ioremap.o mmap.o pgd.o mmu.o \
>                                    context.o proc.o pageattr.o fixmap.o
> -obj-$(CONFIG_ARM64_CONTPTE)    += contpte.o
> +obj-$(CONFIG_THP_CONTPTE)      += contpte.o
>  obj-$(CONFIG_HUGETLB_PAGE)     += hugetlbpage.o
>  obj-$(CONFIG_PTDUMP_CORE)      += ptdump.o
>  obj-$(CONFIG_PTDUMP_DEBUGFS)   += ptdump_debugfs.o
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c325003d6552..fd4de221a1c6 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -984,6 +984,15 @@ config ARCH_HAS_CACHE_LINE_SIZE
>  config ARCH_HAS_CONTPTE
>         bool
>
> +config THP_CONTPTE
> +       bool "Contiguous PTE mappings for user memory" if EXPERT
> +       depends on ARCH_HAS_CONTPTE && TRANSPARENT_HUGEPAGE
> +       default y
> +       help
> +         When enabled, user mappings are configured using the PTE contiguous
> +         bit, for any mappings that meet the size and alignment requirements.
> +         This reduces TLB pressure and improves performance.
> +
>  config ARCH_HAS_CURRENT_STACK_POINTER
>         bool
>         help
> --
> 2.39.2

Thanks
Barry
Alexandre Ghiti May 13, 2024, 1:09 p.m. UTC | #2
Hi Barry,

On Thu, May 9, 2024 at 2:46 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Thu, May 9, 2024 at 7:20 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
> >
> > The ARM64_CONTPTE config represents the capability to transparently use
> > contpte mappings for THP userspace mappings, which will be implemented
> > in the next commits for riscv, so make this config more generic and move
> > it to mm.
> >
> > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > ---
> >  arch/arm64/Kconfig               | 9 ---------
> >  arch/arm64/include/asm/pgtable.h | 6 +++---
> >  arch/arm64/mm/Makefile           | 2 +-
> >  mm/Kconfig                       | 9 +++++++++
> >  4 files changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index ac2f6d906cc3..9d823015b4e5 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -2227,15 +2227,6 @@ config UNWIND_PATCH_PAC_INTO_SCS
> >         select UNWIND_TABLES
> >         select DYNAMIC_SCS
> >
> > -config ARM64_CONTPTE
> > -       bool "Contiguous PTE mappings for user memory" if EXPERT
> > -       depends on TRANSPARENT_HUGEPAGE
> > -       default y
> > -       help
> > -         When enabled, user mappings are configured using the PTE contiguous
> > -         bit, for any mappings that meet the size and alignment requirements.
> > -         This reduces TLB pressure and improves performance.
> > -
> >  endmenu # "Kernel Features"
> >
> >  menu "Boot options"
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index 7c2938cb70b9..1758ce71fae9 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -1369,7 +1369,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
> >                                     unsigned long addr, pte_t *ptep,
> >                                     pte_t old_pte, pte_t new_pte);
> >
> > -#ifdef CONFIG_ARM64_CONTPTE
> > +#ifdef CONFIG_THP_CONTPTE
>
> Is it necessarily THP? can't be hugetlb or others? I feel THP_CONTPTE
> isn't a good name.

This does not target hugetlbfs (see my other patchset for that here
https://lore.kernel.org/linux-riscv/7504a525-8211-48b3-becb-a6e838c1b42e@arm.com/T/#m57d273d680fc531b3aa1074e6f8558a52ba5badc).

What could be "others" here?

Thanks for your comment,

Alex

Barry Song May 14, 2024, 9:30 a.m. UTC | #3
On Tue, May 14, 2024 at 1:09 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> Hi Barry,
>
> On Thu, May 9, 2024 at 2:46 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Thu, May 9, 2024 at 7:20 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
> > >
> > > The ARM64_CONTPTE config represents the capability to transparently use
> > > contpte mappings for THP userspace mappings, which will be implemented
> > > in the next commits for riscv, so make this config more generic and move
> > > it to mm.
> > >
> > > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > > ---
> > >  arch/arm64/Kconfig               | 9 ---------
> > >  arch/arm64/include/asm/pgtable.h | 6 +++---
> > >  arch/arm64/mm/Makefile           | 2 +-
> > >  mm/Kconfig                       | 9 +++++++++
> > >  4 files changed, 13 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index ac2f6d906cc3..9d823015b4e5 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -2227,15 +2227,6 @@ config UNWIND_PATCH_PAC_INTO_SCS
> > >         select UNWIND_TABLES
> > >         select DYNAMIC_SCS
> > >
> > > -config ARM64_CONTPTE
> > > -       bool "Contiguous PTE mappings for user memory" if EXPERT
> > > -       depends on TRANSPARENT_HUGEPAGE
> > > -       default y
> > > -       help
> > > -         When enabled, user mappings are configured using the PTE contiguous
> > > -         bit, for any mappings that meet the size and alignment requirements.
> > > -         This reduces TLB pressure and improves performance.
> > > -
> > >  endmenu # "Kernel Features"
> > >
> > >  menu "Boot options"
> > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > > index 7c2938cb70b9..1758ce71fae9 100644
> > > --- a/arch/arm64/include/asm/pgtable.h
> > > +++ b/arch/arm64/include/asm/pgtable.h
> > > @@ -1369,7 +1369,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
> > >                                     unsigned long addr, pte_t *ptep,
> > >                                     pte_t old_pte, pte_t new_pte);
> > >
> > > -#ifdef CONFIG_ARM64_CONTPTE
> > > +#ifdef CONFIG_THP_CONTPTE
> >
> > Is it necessarily THP? can't be hugetlb or others? I feel THP_CONTPTE
> > isn't a good name.
>
> This does not target hugetlbfs (see my other patchset for that here
> https://lore.kernel.org/linux-riscv/7504a525-8211-48b3-becb-a6e838c1b42e@arm.com/T/#m57d273d680fc531b3aa1074e6f8558a52ba5badc).
>
> What could be "others" here?


I acknowledge that the current focus is on Transparent Huge Pages. However,
many aspects of CONT-PTE appear to be applicable to the mm-core in general.
For example,

/*
 * The below functions constitute the public API that arm64 presents to the
 * core-mm to manipulate PTE entries within their page tables (or at least this
 * is the subset of the API that arm64 needs to implement). These public
 * versions will automatically and transparently apply the contiguous bit where
 * it makes sense to do so. Therefore any users that are contig-aware (e.g.
 * hugetlb, kernel mapper) should NOT use these APIs, but instead use the
 * private versions, which are prefixed with double underscore. All of these
 * APIs except for ptep_get_lockless() are expected to be called with the PTL
 * held. Although the contiguous bit is considered private to the
 * implementation, it is deliberately allowed to leak through the getters (e.g.
 * ptep_get()), back to core code. This is required so that pte_leaf_size() can
 * provide an accurate size for perf_get_pgtable_size(). But this leakage means
 * it's possible a pte will be passed to a setter with the contiguous bit set, so
 * we explicitly clear the contiguous bit in those cases to prevent accidentally
 * setting it in the pgtable.
 */

#define ptep_get ptep_get
static inline pte_t ptep_get(pte_t *ptep)
{
        pte_t pte = __ptep_get(ptep);

        if (likely(!pte_valid_cont(pte)))
                return pte;

        return contpte_ptep_get(ptep, pte);
}

Could it possibly be given a more generic name such as "PGTABLE_CONTPTE"?
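
For readers who do not have the arm64 code at hand, the getter quoted
above can be modelled in user space roughly as follows. This is only a
sketch: the bit layout and the model_* names are made up for illustration,
and the slow path reflects my understanding that the contpte getter
gathers the access and dirty bits from every pte of the contiguous block,
since the hardware may have set them on any entry of the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Illustrative bit layout, NOT the arm64 or riscv hardware encoding. */
#define PTE_VALID ((pte_t)1 << 0)
#define PTE_YOUNG ((pte_t)1 << 1)
#define PTE_DIRTY ((pte_t)1 << 2)
#define PTE_CONT  ((pte_t)1 << 3)

#define CONT_PTES 16	/* assumption: 16 * 4KB = 64KB block */

/* True if the pte is valid and is part of a contiguous block. */
static bool model_pte_valid_cont(pte_t pte)
{
	return (pte & (PTE_VALID | PTE_CONT)) == (PTE_VALID | PTE_CONT);
}

/*
 * Model of the public getter. table is a page table whose start is
 * aligned to CONT_PTES entries, idx is the entry being read.
 */
static pte_t model_ptep_get(const pte_t *table, size_t idx)
{
	pte_t pte;
	size_t start, i;

	assert(table != NULL);
	pte = table[idx];

	/* Fast path: not part of a contiguous block, return it as is. */
	if (!model_pte_valid_cont(pte))
		return pte;

	/* Slow path: OR in young/dirty from every pte of the block. */
	start = idx - (idx % CONT_PTES);
	for (i = start; i < start + CONT_PTES; i++)
		pte |= table[i] & (PTE_YOUNG | PTE_DIRTY);

	return pte;
}
```

Nothing in this model depends on THP: the caller only sees a pte with the
young and dirty state of the whole block folded in.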

Thanks
Barry