diff mbox series

[RFC,RESEND,v2,1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd

Message ID 20250102233255.1180524-2-isaacmanjarres@google.com
State New
Headers show
Series Add file seal to prevent future exec mappings | expand

Commit Message

Isaac Manjarres Jan. 2, 2025, 11:32 p.m. UTC
Android currently uses the ashmem driver [1] for creating shared memory
regions between processes. Ashmem buffers can initially be mapped with
PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
permissions that the buffer can be mapped with.

Processes can remove the ability to map ashmem buffers as executable to
ensure that those buffers cannot be exploited to run unintended code.

For instance, suppose process A allocates a memfd that is meant to be
read and written by itself and another process, call it B.

Process A shares the buffer with process B, but process B injects code
into the buffer, and compromises process A, such that it makes A map
the buffer with PROT_EXEC. This provides an opportunity for process A
to run the code that process B injected into the buffer.

If process A had the ability to seal the buffer against future
executable mappings before sharing the buffer with process B, this
attack would not be possible.

Android is currently trying to replace ashmem with memfd. However, memfd
does not have a provision to permanently remove the ability to map a
buffer as executable, and leaves itself open to the type of attack
described earlier. However, this should be something that can be
achieved via a new file seal.

There are known usecases (e.g. CursorWindow [2]) where a process
maps a buffer with read/write permissions before restricting the buffer
to being mapped as read-only for future mappings.

The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
that mprotect() can change the mapping to be executable. Therefore,
implementing the seal similar to F_SEAL_WRITE would not be appropriate,
since it would not work with the CursorWindow usecase. This is because
the CursorWindow process restricts the mapping permissions to read-only
after the writable mapping is created. So, adding a file seal for
executable mappings that operates like F_SEAL_WRITE would fail.

Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
continue to create a writable mapping initially, and then restrict the
permissions on the buffer to be mappable as read-only by using both
F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
applied, any calls to mmap() with PROT_EXEC will fail.

[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
[2] https://developer.android.com/reference/android/database/CursorWindow

Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
---
 include/uapi/linux/fcntl.h |  1 +
 mm/memfd.c                 | 39 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

Comments

Lorenzo Stoakes Jan. 8, 2025, 8:54 p.m. UTC | #1
On Thu, Jan 02, 2025 at 03:32:50PM -0800, Isaac J. Manjarres wrote:
> Android currently uses the ashmem driver [1] for creating shared memory
> regions between processes. Ashmem buffers can initially be mapped with
> PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> permissions that the buffer can be mapped with.
>
> Processes can remove the ability to map ashmem buffers as executable to
> ensure that those buffers cannot be exploited to run unintended code.
>
> For instance, suppose process A allocates a memfd that is meant to be
> read and written by itself and another process, call it B.
>
> Process A shares the buffer with process B, but process B injects code
> into the buffer, and compromises process A, such that it makes A map
> the buffer with PROT_EXEC. This provides an opportunity for process A
> to run the code that process B injected into the buffer.
>
> If process A had the ability to seal the buffer against future
> executable mappings before sharing the buffer with process B, this
> attack would not be possible.
>
> Android is currently trying to replace ashmem with memfd. However, memfd
> does not have a provision to permanently remove the ability to map a
> buffer as executable, and leaves itself open to the type of attack
> described earlier. However, this should be something that can be
> achieved via a new file seal.
>
> There are known usecases (e.g. CursorWindow [2]) where a process
> maps a buffer with read/write permissions before restricting the buffer
> to being mapped as read-only for future mappings.
>
> The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
> that mprotect() can change the mapping to be executable. Therefore,
> implementing the seal similar to F_SEAL_WRITE would not be appropriate,
> since it would not work with the CursorWindow usecase. This is because
> the CursorWindow process restricts the mapping permissions to read-only
> after the writable mapping is created. So, adding a file seal for
> executable mappings that operates like F_SEAL_WRITE would fail.
>
> Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
> similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
> continue to create a writable mapping initially, and then restrict the
> permissions on the buffer to be mappable as read-only by using both
> F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
> applied, any calls to mmap() with PROT_EXEC will fail.
>
> [1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
> [2] https://developer.android.com/reference/android/database/CursorWindow
>
> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
> ---
>  include/uapi/linux/fcntl.h |  1 +
>  mm/memfd.c                 | 39 +++++++++++++++++++++++++++++++++++++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index 6e6907e63bfc..ef066e524777 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -49,6 +49,7 @@
>  #define F_SEAL_WRITE	0x0008	/* prevent writes */
>  #define F_SEAL_FUTURE_WRITE	0x0010  /* prevent future writes while mapped */
>  #define F_SEAL_EXEC	0x0020  /* prevent chmod modifying exec bits */

Hmm ok I just noticed this... F_SEAL_EXEC is weird then.

It doesn't prevent execution in the same way F_SEAL_WRITE does, nor does it seem
to check or care about VM_MAYEXEC...

It just 'prevents chmod from modifying exec bits'.

I mean lord above haha.

And of course the code for it is in shmem_setattr()...

I have not enough faces to palm or palms to face.

So yes I suppose for any sane exec semantics you'll need something new...

> +#define F_SEAL_FUTURE_EXEC	0x0040 /* prevent future executable mappings */
>  /* (1U << 31) is reserved for signed error codes */
>
>  /*
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 5f5a23c9051d..cfd62454df5e 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
>  }
>
>  #define F_ALL_SEALS (F_SEAL_SEAL | \
> +		     F_SEAL_FUTURE_EXEC |\
>  		     F_SEAL_EXEC | \
>  		     F_SEAL_SHRINK | \
>  		     F_SEAL_GROW | \
> @@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm_flags_ptr)
>  	return 0;
>  }
>
> +static inline bool is_exec_sealed(unsigned int seals)

This should say 'future', otherwise this is very confusing vs. F_SEAL_EXEC.

Also no need for inline outside of a header.

> +{
> +	return seals & F_SEAL_FUTURE_EXEC;
> +}
> +
> +static int check_exec_seal(unsigned long *vm_flags_ptr)
> +{
> +	unsigned long vm_flags = *vm_flags_ptr;
> +	unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC);
> +
> +	/* Executability is not a concern for private mappings. */
> +	if (!(mask & VM_SHARED))
> +		return 0;
> +
> +	/*
> +	 * New PROT_EXEC and MAP_SHARED mmaps are not allowed when exec seal
> +	 * is active.
> +	 */
> +	if (mask & VM_EXEC)
> +		return -EPERM;
> +
> +	/*
> +	 * Prevent mprotect() from making an exec-sealed mapping executable in
> +	 * the future.
> +	 */
> +	*vm_flags_ptr &= ~VM_MAYEXEC;
> +
> +	return 0;
> +}
> +
>  int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr)
>  {
>  	int err = 0;
>  	unsigned int *seals_ptr = memfd_file_seals_ptr(file);
>  	unsigned int seals = seals_ptr ? *seals_ptr : 0;
>
> -	if (is_write_sealed(seals))
> +	if (is_write_sealed(seals)) {
>  		err = check_write_seal(vm_flags_ptr);
> +		if (err)
> +			return err;
> +	}
> +
> +	if (is_exec_sealed(seals))
> +		err = check_exec_seal(vm_flags_ptr);
>
>  	return err;
>  }

OK this is actually quite neat now we have everything set up in do_mmap().

I think we probably want some comments to very clearly point out that
F_SEAL_EXEC is a bit crazy and weird and meaningless and this is actually
vaguely sane...

> --
> 2.47.1.613.gc27f4b7a9f-goog
>
diff mbox series

Patch

diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6e6907e63bfc..ef066e524777 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -49,6 +49,7 @@ 
 #define F_SEAL_WRITE	0x0008	/* prevent writes */
 #define F_SEAL_FUTURE_WRITE	0x0010  /* prevent future writes while mapped */
 #define F_SEAL_EXEC	0x0020  /* prevent chmod modifying exec bits */
+#define F_SEAL_FUTURE_EXEC	0x0040 /* prevent future executable mappings */
 /* (1U << 31) is reserved for signed error codes */
 
 /*
diff --git a/mm/memfd.c b/mm/memfd.c
index 5f5a23c9051d..cfd62454df5e 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -184,6 +184,7 @@  static unsigned int *memfd_file_seals_ptr(struct file *file)
 }
 
 #define F_ALL_SEALS (F_SEAL_SEAL | \
+		     F_SEAL_FUTURE_EXEC |\
 		     F_SEAL_EXEC | \
 		     F_SEAL_SHRINK | \
 		     F_SEAL_GROW | \
@@ -357,14 +358,50 @@  static int check_write_seal(unsigned long *vm_flags_ptr)
 	return 0;
 }
 
+static inline bool is_exec_sealed(unsigned int seals)
+{
+	return seals & F_SEAL_FUTURE_EXEC;
+}
+
+static int check_exec_seal(unsigned long *vm_flags_ptr)
+{
+	unsigned long vm_flags = *vm_flags_ptr;
+	unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC);
+
+	/* Executability is not a concern for private mappings. */
+	if (!(mask & VM_SHARED))
+		return 0;
+
+	/*
+	 * New PROT_EXEC and MAP_SHARED mmaps are not allowed when exec seal
+	 * is active.
+	 */
+	if (mask & VM_EXEC)
+		return -EPERM;
+
+	/*
+	 * Prevent mprotect() from making an exec-sealed mapping executable in
+	 * the future.
+	 */
+	*vm_flags_ptr &= ~VM_MAYEXEC;
+
+	return 0;
+}
+
 int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr)
 {
 	int err = 0;
 	unsigned int *seals_ptr = memfd_file_seals_ptr(file);
 	unsigned int seals = seals_ptr ? *seals_ptr : 0;
 
-	if (is_write_sealed(seals))
+	if (is_write_sealed(seals)) {
 		err = check_write_seal(vm_flags_ptr);
+		if (err)
+			return err;
+	}
+
+	if (is_exec_sealed(seals))
+		err = check_exec_seal(vm_flags_ptr);
 
 	return err;
 }