arm64: Introduce execute-only page access permissions

Message ID 1470937490-7375-1-git-send-email-catalin.marinas@arm.com
State Accepted
Commit cab15ce604e550020bb7115b779013b91bcdbc21

Commit Message

Catalin Marinas Aug. 11, 2016, 5:44 p.m. UTC
The ARMv8 architecture allows execute-only user permissions by clearing
the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
implementation without User Access Override (ARMv8.2 onwards) can still
access such a page, so execute-only page permissions do not protect
against read(2)/write(2) etc. accesses. Systems requiring such
protection must enable features like SECCOMP.

This patch changes the arm64 __P100 and __S100 protection_map[] macros
to the new PAGE_EXECONLY attributes. A side effect is that pte_user()
no longer triggers for PAGE_EXECONLY since PTE_USER isn't set. To work
around this, the check is done on the PTE_NG bit via the pte_ng()
macro. VM_READ is now also checked for page faults.

Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

---
 arch/arm64/include/asm/pgtable-prot.h |  5 +++--
 arch/arm64/include/asm/pgtable.h      | 10 +++++-----
 arch/arm64/mm/fault.c                 |  5 ++---
 mm/mmap.c                             |  5 +++++
 4 files changed, 15 insertions(+), 10 deletions(-)
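
For illustration (not part of the patch), a minimal userspace sketch of
how an execute-only mapping would be obtained once this change is in
place; a direct read of the mapping then raises SIGSEGV:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 1;
	memset(p, 0, 4096);		/* stand-in for copying real code */
	/* PROT_EXEC alone now selects PAGE_EXECONLY (__P100/__S100) */
	if (mprotect(p, 4096, PROT_EXEC))
		return 1;
	/* *(volatile char *)p;	   <- would fault: no read permission */
	printf("%p is now execute-only\n", p);
	return 0;
}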

Comments

Catalin Marinas Aug. 15, 2016, 10:47 a.m. UTC | #1
On Fri, Aug 12, 2016 at 11:23:03AM -0700, Kees Cook wrote:
> On Thu, Aug 11, 2016 at 10:44 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > The ARMv8 architecture allows execute-only user permissions by clearing
> > the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
> > implementation without User Access Override (ARMv8.2 onwards) can still
> > access such a page, so execute-only page permissions do not protect
> > against read(2)/write(2) etc. accesses. Systems requiring such
> > protection must enable features like SECCOMP.
>
> So, UAO CPUs will bypass this protection in userspace if using
> read/write on a memory-mapped file?

It's the other way around. CPUs prior to ARMv8.2 (when UAO was
introduced), or with CONFIG_ARM64_UAO disabled, can still access user
execute-only memory regions while running in kernel mode via the
copy_*_user(), (get|put)_user etc. routines. So one way a user can
bypass this protection is by passing such an address as an argument to
read/write file operations.

I don't think mmap() is an issue since such a region is already mapped,
so it would require mprotect(). As for the latter, it would most likely
be restricted (probably together with read/write) by SECCOMP.
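
To make the bypass concrete, a rough userspace sketch (leak_execonly()
is a hypothetical helper; assumes a pre-ARMv8.2 CPU or CONFIG_ARM64_UAO
disabled):

#include <unistd.h>

/*
 * A direct load from 'code' would fault since EL0 has no read
 * permission, but the privileged copy_from_user() performed by
 * write(2) still succeeds, leaking the execute-only page to the fd.
 */
static void leak_execonly(int fd, const void *code, size_t len)
{
	/* char c = *(const char *)code;   <- faults at EL0 */
	ssize_t n = write(fd, code, len);  /* kernel-side access succeeds */
	(void)n;
}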

> I'm just trying to make sure I understand the bypass scenario. And is
> this something that can be fixed? If we add exec-only, I feel like it
> shouldn't have corner case surprises. :)

I think we need a better understanding of the usage scenarios for
exec-only. IIUC (from those who first asked me for this feature), it is
an additional protection on top of ASLR to prevent an untrusted entity
from scanning the memory for ROP/JOP gadgets. An instrumented compiler
would avoid generating the literal pool in the same section as the
executable code, thus allowing the instructions to be mapped as
execute-only. It's not clear to me how such untrusted code ends up
scanning the memory, maybe by relying on other pre-existing bugs (buffer
under/overflows). I assume that if such code is allowed to make system
calls, all bets are off already.

-- 
Catalin
Catalin Marinas Aug. 16, 2016, 4:18 p.m. UTC | #2
On Mon, Aug 15, 2016 at 10:45:09AM -0700, Kees Cook wrote:
> On Mon, Aug 15, 2016 at 3:47 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Fri, Aug 12, 2016 at 11:23:03AM -0700, Kees Cook wrote:
> >> On Thu, Aug 11, 2016 at 10:44 AM, Catalin Marinas
> >> <catalin.marinas@arm.com> wrote:
> >> > The ARMv8 architecture allows execute-only user permissions by clearing
> >> > the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
> >> > implementation without User Access Override (ARMv8.2 onwards) can still
> >> > access such a page, so execute-only page permissions do not protect
> >> > against read(2)/write(2) etc. accesses. Systems requiring such
> >> > protection must enable features like SECCOMP.
> >>
> >> So, UAO CPUs will bypass this protection in userspace if using
> >> read/write on a memory-mapped file?
> >
> > It's the other way around. CPUs prior to ARMv8.2 (when UAO was
> > introduced), or with CONFIG_ARM64_UAO disabled, can still access user
> > execute-only memory regions while running in kernel mode via the
> > copy_*_user(), (get|put)_user etc. routines. So one way a user can
> > bypass this protection is by passing such an address as an argument to
> > read/write file operations.
>
> Ah, okay. So exec-only for _userspace_ will always work, but exec-only
> for _kernel_ will only work on ARMv8.2 with CONFIG_ARM64_UAO?

Yes (mostly). With UAO, we changed the user access routines in the
kernel to use the LDTR/STTR instructions, which always perform
unprivileged accesses even when executed in kernel mode (unless the UAO
bit is set to override this restriction, as needed for
set_fs(KERNEL_DS)).
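
A simplified sketch of that mechanism (loosely modelled on the arm64
uaccess code, not the verbatim implementation; __load_user_word() is a
hypothetical name and the fault fixup is omitted):

#include <asm/alternative.h>
#include <asm/cpufeature.h>

/*
 * On CPUs with UAO the load is patched to LDTR, which is checked
 * against the EL0 (user) permissions even when executed at EL1, so an
 * execute-only user page faults as expected.
 */
static inline unsigned long __load_user_word(const unsigned long __user *addr)
{
	unsigned long val;

	asm volatile(ALTERNATIVE("ldr %0, [%1]",   /* pre-UAO: privileged */
				 "ldtr %0, [%1]",  /* UAO: unprivileged */
				 ARM64_HAS_UAO)
		     : "=r" (val)
		     : "r" (addr));
	return val;
}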

Even with UAO, we still have two cases where the kernel cannot perform
unprivileged accesses, since the exclusives (LDXR/STXR) have no
unprivileged LDTR/STTR-style equivalent. These are the in-user futex
atomic ops and the SWP emulation for 32-bit binaries
(armv8_deprecated.c). But these require write permission, so they would
always fault even when running in the kernel.
futex_atomic_cmpxchg_inatomic() is able to return the old value without
a write (if it differs from "oldval"), but it doesn't look like such a
value could leak to user space.
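
To illustrate that last point, a simplified sketch of the LDXR/STXR
pattern (loosely based on futex_atomic_cmpxchg_inatomic();
cmpxchg_user_sketch() is a hypothetical name, with fault fixup, PAN
handling and barriers omitted):

#include <linux/types.h>

/*
 * If the loaded value differs from oldval, the branch to 3f skips the
 * STXR entirely, so the old value is observed without any write ever
 * being attempted against the page.
 */
static inline u32 cmpxchg_user_sketch(u32 __user *uaddr, u32 oldval, u32 newval)
{
	u32 val, tmp;

	asm volatile(
	"1:	ldxr	%w0, [%2]\n"	/* privileged exclusive load */
	"	sub	%w1, %w0, %w3\n"
	"	cbnz	%w1, 3f\n"	/* mismatch: no store attempted */
	"	stxr	%w1, %w4, [%2]\n"
	"	cbnz	%w1, 1b\n"	/* exclusivity lost: retry */
	"3:\n"
	: "=&r" (val), "=&r" (tmp)
	: "r" (uaddr), "r" (oldval), "r" (newval)
	: "memory");

	return val;	/* old value, even when nothing was written */
}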

-- 
Catalin
Will Deacon Aug. 25, 2016, 10:30 a.m. UTC | #3
On Tue, Aug 16, 2016 at 05:18:24PM +0100, Catalin Marinas wrote:
> On Mon, Aug 15, 2016 at 10:45:09AM -0700, Kees Cook wrote:
> > On Mon, Aug 15, 2016 at 3:47 AM, Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > On Fri, Aug 12, 2016 at 11:23:03AM -0700, Kees Cook wrote:
> > >> On Thu, Aug 11, 2016 at 10:44 AM, Catalin Marinas
> > >> <catalin.marinas@arm.com> wrote:
> > >> > The ARMv8 architecture allows execute-only user permissions by clearing
> > >> > the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
> > >> > implementation without User Access Override (ARMv8.2 onwards) can still
> > >> > access such a page, so execute-only page permissions do not protect
> > >> > against read(2)/write(2) etc. accesses. Systems requiring such
> > >> > protection must enable features like SECCOMP.
> > >>
> > >> So, UAO CPUs will bypass this protection in userspace if using
> > >> read/write on a memory-mapped file?
> > >
> > > It's the other way around. CPUs prior to ARMv8.2 (when UAO was
> > > introduced), or with CONFIG_ARM64_UAO disabled, can still access user
> > > execute-only memory regions while running in kernel mode via the
> > > copy_*_user(), (get|put)_user etc. routines. So one way a user can
> > > bypass this protection is by passing such an address as an argument to
> > > read/write file operations.
> >
> > Ah, okay. So exec-only for _userspace_ will always work, but exec-only
> > for _kernel_ will only work on ARMv8.2 with CONFIG_ARM64_UAO?
>
> Yes (mostly). With UAO, we changed the user access routines in the
> kernel to use the LDTR/STTR instructions, which always perform
> unprivileged accesses even when executed in kernel mode (unless the UAO
> bit is set to override this restriction, as needed for
> set_fs(KERNEL_DS)).
>
> Even with UAO, we still have two cases where the kernel cannot perform
> unprivileged accesses, since the exclusives (LDXR/STXR) have no
> unprivileged LDTR/STTR-style equivalent. These are the in-user futex
> atomic ops and the SWP emulation for 32-bit binaries
> (armv8_deprecated.c). But these require write permission, so they would
> always fault even when running in the kernel.
> futex_atomic_cmpxchg_inatomic() is able to return the old value without
> a write (if it differs from "oldval"), but it doesn't look like such a
> value could leak to user space.

If this were an issue, couldn't we add a dummy LDTR before the LDXR and
have the fixup handler return -EFAULT?
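
For concreteness, a hedged sketch of that suggestion
(futex_probe_user() is a hypothetical helper modelled on the asm in
arch/arm64/include/asm/futex.h; not part of this patch):

#include <linux/errno.h>
#include <linux/types.h>

/*
 * A dummy unprivileged load forces the EL0 permission check up front;
 * the exception fixup turns a faulting probe into -EFAULT before any
 * privileged LDXR/STXR sequence runs.
 */
static inline int futex_probe_user(u32 __user *uaddr)
{
	u32 tmp;
	int ret = 0;

	asm volatile(
	"1:	ldtr	%w1, [%2]\n"	/* dummy unprivileged load */
	"2:\n"
	"	.section .fixup, \"ax\"\n"
	"3:	mov	%w0, %3\n"	/* faulting probe -> -EFAULT */
	"	b	2b\n"
	"	.previous\n"
	_ASM_EXTABLE(1b, 3b)
	: "+r" (ret), "=&r" (tmp)
	: "r" (uaddr), "i" (-EFAULT));

	return ret;
}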

Either way, this series looks technically fine to me:

Reviewed-by: Will Deacon <will.deacon@arm.com>


but it would be good for some security-focussed person (Hi, Kees!) to
comment on whether or not this is useful, given the caveats you've
described. If it is, I can queue it for 4.9.

Will

Patch

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 39f5252673f7..2142c7726e76 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -70,12 +70,13 @@ 
 #define PAGE_COPY_EXEC		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN)
 #define PAGE_READONLY		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN)
 #define PAGE_READONLY_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN)
+#define PAGE_EXECONLY		__pgprot(_PAGE_DEFAULT | PTE_NG | PTE_PXN)
 
 #define __P000  PAGE_NONE
 #define __P001  PAGE_READONLY
 #define __P010  PAGE_COPY
 #define __P011  PAGE_COPY
-#define __P100  PAGE_READONLY_EXEC
+#define __P100  PAGE_EXECONLY
 #define __P101  PAGE_READONLY_EXEC
 #define __P110  PAGE_COPY_EXEC
 #define __P111  PAGE_COPY_EXEC
@@ -84,7 +85,7 @@ 
 #define __S001  PAGE_READONLY
 #define __S010  PAGE_SHARED
 #define __S011  PAGE_SHARED
-#define __S100  PAGE_READONLY_EXEC
+#define __S100  PAGE_EXECONLY
 #define __S101  PAGE_READONLY_EXEC
 #define __S110  PAGE_SHARED_EXEC
 #define __S111  PAGE_SHARED_EXEC
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index dbb1b7bf1b07..403a61cf4967 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -74,7 +74,7 @@  extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define pte_write(pte)		(!!(pte_val(pte) & PTE_WRITE))
 #define pte_exec(pte)		(!(pte_val(pte) & PTE_UXN))
 #define pte_cont(pte)		(!!(pte_val(pte) & PTE_CONT))
-#define pte_user(pte)		(!!(pte_val(pte) & PTE_USER))
+#define pte_ng(pte)		(!!(pte_val(pte) & PTE_NG))
 
 #ifdef CONFIG_ARM64_HW_AFDBM
 #define pte_hw_dirty(pte)	(pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
@@ -85,8 +85,8 @@  extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define pte_dirty(pte)		(pte_sw_dirty(pte) || pte_hw_dirty(pte))
 
 #define pte_valid(pte)		(!!(pte_val(pte) & PTE_VALID))
-#define pte_valid_not_user(pte) \
-	((pte_val(pte) & (PTE_VALID | PTE_USER)) == PTE_VALID)
+#define pte_valid_global(pte) \
+	((pte_val(pte) & (PTE_VALID | PTE_NG)) == PTE_VALID)
 #define pte_valid_young(pte) \
 	((pte_val(pte) & (PTE_VALID | PTE_AF)) == (PTE_VALID | PTE_AF))
 
@@ -179,7 +179,7 @@  static inline void set_pte(pte_t *ptep, pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
 	 */
-	if (pte_valid_not_user(pte)) {
+	if (pte_valid_global(pte)) {
 		dsb(ishst);
 		isb();
 	}
@@ -213,7 +213,7 @@  static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 			pte_val(pte) &= ~PTE_RDONLY;
 		else
 			pte_val(pte) |= PTE_RDONLY;
-		if (pte_user(pte) && pte_exec(pte) && !pte_special(pte))
+		if (pte_ng(pte) && pte_exec(pte) && !pte_special(pte))
 			__sync_icache_dcache(pte, addr);
 	}
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c8beaa0da7df..58f697fe18b6 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -245,8 +245,7 @@  static int __do_page_fault(struct mm_struct *mm, unsigned long addr,
 good_area:
 	/*
 	 * Check that the permissions on the VMA allow for the fault which
-	 * occurred. If we encountered a write or exec fault, we must have
-	 * appropriate permissions, otherwise we allow any permission.
+	 * occurred.
 	 */
 	if (!(vma->vm_flags & vm_flags)) {
 		fault = VM_FAULT_BADACCESS;
@@ -281,7 +280,7 @@  static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	struct task_struct *tsk;
 	struct mm_struct *mm;
 	int fault, sig, code;
-	unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned long vm_flags = VM_READ | VM_WRITE;
 	unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 	if (notify_page_fault(regs, esr))
diff --git a/mm/mmap.c b/mm/mmap.c
index ca9d91bca0d6..69cad562cd00 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -88,6 +88,11 @@  static void unmap_region(struct mm_struct *mm,
  *		w: (no) no	w: (no) no	w: (copy) copy	w: (no) no
  *		x: (no) no	x: (no) yes	x: (no) yes	x: (yes) yes
  *
+ * On arm64, PROT_EXEC has the following behaviour for both MAP_SHARED and
+ * MAP_PRIVATE:
+ *								r: (no) no
+ *								w: (no) no
+ *								x: (yes) yes
  */
 pgprot_t protection_map[16] = {
 	__P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,