From patchwork Wed Dec 7 15:49:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 631781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 507E5C63706 for ; Wed, 7 Dec 2022 15:49:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229619AbiLGPtv (ORCPT ); Wed, 7 Dec 2022 10:49:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229746AbiLGPtq (ORCPT ); Wed, 7 Dec 2022 10:49:46 -0500 Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 040134B777 for ; Wed, 7 Dec 2022 07:49:46 -0800 (PST) Received: by mail-pl1-x631.google.com with SMTP id p24so17413883plw.1 for ; Wed, 07 Dec 2022 07:49:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=71ca0yPxFF6EaiLKMWlrSqitNHDHlqW3RCkGeX1vy+A=; b=PhU557wYsKo/iDLZ/eBYG10j3SUNufkud5V4AzaP2ICyOtQc995aeL12t/n/DCDOjS NMogyAHS/VfnZLMBDSu6Di4KmFIVVQ0uOhTFzgcacngKlg82FwY5QBCqc7jvGtP83Lsg 0oNoHBlwcyoa3/tfXpapegc2TgwcvzzxnMc4Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=71ca0yPxFF6EaiLKMWlrSqitNHDHlqW3RCkGeX1vy+A=; b=p6pYHtHY87DyIb/qp6Dc8/mBldmFIedR/y+AEtk5oV9oEvrExBC0hi017Lz5O/cdco jrW3D07iu4a+dHlaGB67YbpYbABnybX9PZbAeKfqBSNHY+Dlz1QiUm6ybUQEMGPpeVbD 9lVrW58E1MGDeB9TX4OSgLmpM1tDFwpYIkpCl69eJpnzk+ypeH06OvizV9qejYeUy827 zd3rjKoAav1OmA06ytuTHFAPV9y7qlEe7oYlEsHMfGPvX0braIlGwh0dOMvlvvuuTnxZ 5zvfwHZ2hiSN7lebZpjWeJteQ3kR+i49gWI7P1iftAVe3UuOCWatZWnzmA40lJAujSTR awhw== X-Gm-Message-State: ANoB5pm2rEjtmbvZIAQ/SPieVeaJdlUXK6oFaZjIl1eBgnXYWaTVNx0P djlSEK/47JK+OQ7J9+0+KZiwfQ== X-Google-Smtp-Source: AA0mqf6LlztV8oc9Di/UeVnMjrwTxAuHnQqhNmrhBv6DODQ39jcxdztg/nAsot9XSmGNU3B9Sjff5Q== X-Received: by 2002:a17:902:8343:b0:187:1f0:4579 with SMTP id z3-20020a170902834300b0018701f04579mr748942pln.61.1670428185700; Wed, 07 Dec 2022 07:49:45 -0800 (PST) Received: from jeffxud.c.googlers.com.com (30.202.168.34.bc.googleusercontent.com. [34.168.202.30]) by smtp.gmail.com with ESMTPSA id a9-20020a170902ecc900b0017f7628cbddsm14920934plh.30.2022.12.07.07.49.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Dec 2022 07:49:44 -0800 (PST) From: jeffxu@chromium.org To: skhan@linuxfoundation.org, keescook@chromium.org Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com, dverkamp@chromium.org, hughd@google.com, jeffxu@google.com, jorgelo@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com, linux-hardening@vger.kernel.org Subject: [PATCH v6 1/6] mm/memfd: add F_SEAL_EXEC Date: Wed, 7 Dec 2022 15:49:34 +0000 Message-Id: <20221207154939.2532830-2-jeffxu@google.com> X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog In-Reply-To: <20221207154939.2532830-1-jeffxu@google.com> References: <20221207154939.2532830-1-jeffxu@google.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Daniel Verkamp The new F_SEAL_EXEC flag will prevent modification of the exec bits: written as traditional octal mask, 0111, or as named flags, S_IXUSR | S_IXGRP | S_IXOTH. Any chmod(2) or similar call that attempts to modify any of these bits after the seal is applied will fail with errno EPERM. This will preserve the execute bits as they are at the time of sealing, so the memfd will become either permanently executable or permanently un-executable. Signed-off-by: Daniel Verkamp Co-developed-by: Jeff Xu Signed-off-by: Jeff Xu Reviewed-by: Kees Cook --- include/uapi/linux/fcntl.h | 1 + mm/memfd.c | 2 ++ mm/shmem.c | 6 ++++++ 3 files changed, 9 insertions(+) diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h index 2f86b2ad6d7e..e8c07da58c9f 100644 --- a/include/uapi/linux/fcntl.h +++ b/include/uapi/linux/fcntl.h @@ -43,6 +43,7 @@ #define F_SEAL_GROW 0x0004 /* prevent file from growing */ #define F_SEAL_WRITE 0x0008 /* prevent writes */ #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ +#define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */ /* (1U << 31) is reserved for signed error codes */ /* diff --git a/mm/memfd.c b/mm/memfd.c index 08f5f8304746..4ebeab94aa74 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -147,6 +147,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file) } #define F_ALL_SEALS (F_SEAL_SEAL | \ + F_SEAL_EXEC | \ F_SEAL_SHRINK | \ F_SEAL_GROW | \ F_SEAL_WRITE | \ @@ -175,6 +176,7 @@ static int memfd_add_seals(struct file *file, unsigned int seals) * SEAL_SHRINK: Prevent the file from shrinking * SEAL_GROW: Prevent the file from growing * SEAL_WRITE: Prevent write access to the file + * SEAL_EXEC: Prevent modification of the exec bits in the file mode * * As we don't require any trust relationship between two parties, we * must prevent seals from being removed. Therefore, sealing a file diff --git a/mm/shmem.c b/mm/shmem.c index c1d8b8a1aa3b..e18a9cf9d937 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1085,6 +1085,12 @@ static int shmem_setattr(struct user_namespace *mnt_userns, if (error) return error; + if ((info->seals & F_SEAL_EXEC) && (attr->ia_valid & ATTR_MODE)) { + if ((inode->i_mode ^ attr->ia_mode) & 0111) { + return -EPERM; + } + } + if (S_ISREG(inode->i_mode) && (attr->ia_valid & ATTR_SIZE)) { loff_t oldsize = inode->i_size; loff_t newsize = attr->ia_size; From patchwork Wed Dec 7 15:49:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 631780 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE691C63703 for ; Wed, 7 Dec 2022 15:49:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229844AbiLGPtw (ORCPT ); Wed, 7 Dec 2022 10:49:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50626 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229811AbiLGPtt (ORCPT ); Wed, 7 Dec 2022 10:49:49 -0500 Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29AED5E3F5 for ; Wed, 7 Dec 2022 07:49:48 -0800 (PST) Received: by mail-pl1-x635.google.com with SMTP id m4so10463124pls.4 for ; Wed, 07 Dec 2022 07:49:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=j02hcUmpYLv5fRuUpSeXXOL/JCHtO2qkj12Q1X4RZAg=; b=R5BHpK7ZG9aovJ+1jeInMWGBsCCU+DnF25bMLAW45g60cIzG8FFjGMOXE+YWkWnupK kQHfR+uCW3ib+lGzbzFcJTQMcw+b6mTj+yo7v+J3Y5PsUrokernH9eTgmzS5L2ZyHQPx He6qg6g0nzNS1MVsYGfSwJbIUQhYQcP1HAQvE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=j02hcUmpYLv5fRuUpSeXXOL/JCHtO2qkj12Q1X4RZAg=; b=uWA48u9SKI+YbZwgWl1Bpw/ceuSI8VCZ7SzblH/Jq1lpBv7vA+AkcEwTLZim/OEnZF S3gZW/MJBMW4Fep2PM6z/CtTfOIwlFzYHW81n5k1RvevIZHuOIa4wiAm3rw+ePNXqPzL PcFhF6SMivBeNhAhf+qtrX4whQrT4pfh2z3f2js36ZAIRJ1aJkDxom2wzRjzA4WoDqru 7VTdY23Uw7Cb0WsL+QlesBrTjBGCIDvu9V+rOplLzRjEx0Blxn1nv1nfukCjzRgfvCnl rvARFQHGagb2WtFY0HJDfaFkpG4dXokv4FaJCWEGZ8ZZ5acf7ojJtmIofl266417mQt5 d8SQ== X-Gm-Message-State: ANoB5pkqYzobbfC+IPiccIMbZs6Sblr/wGkmWtF40kInvlbcscXzBzl7 Il7Av/pcekqhdeeGkyS5THPwqg== X-Google-Smtp-Source: AA0mqf49DtKKipwaoufTRe2LK61YHmgP1blkWyCRZQSQe30lAZaQ8E20zAeYdQETl/N3udGE0IQ+MA== X-Received: by 2002:a17:902:9891:b0:189:2688:c97f with SMTP id s17-20020a170902989100b001892688c97fmr690338plp.50.1670428187656; Wed, 07 Dec 2022 07:49:47 -0800 (PST) Received: from jeffxud.c.googlers.com.com (30.202.168.34.bc.googleusercontent.com. [34.168.202.30]) by smtp.gmail.com with ESMTPSA id a9-20020a170902ecc900b0017f7628cbddsm14920934plh.30.2022.12.07.07.49.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Dec 2022 07:49:46 -0800 (PST) From: jeffxu@chromium.org To: skhan@linuxfoundation.org, keescook@chromium.org Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com, dverkamp@chromium.org, hughd@google.com, jeffxu@google.com, jorgelo@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com, linux-hardening@vger.kernel.org, kernel test robot Subject: [PATCH v6 3/6] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC Date: Wed, 7 Dec 2022 15:49:36 +0000 Message-Id: <20221207154939.2532830-4-jeffxu@google.com> X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog In-Reply-To: <20221207154939.2532830-1-jeffxu@google.com> References: <20221207154939.2532830-1-jeffxu@google.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Jeff Xu The new MFD_NOEXEC_SEAL and MFD_EXEC flags allows application to set executable bit at creation time (memfd_create). When MFD_NOEXEC_SEAL is set, memfd is created without executable bit (mode:0666), and sealed with F_SEAL_EXEC, so it can't be chmod to be executable (mode: 0777) after creation. when MFD_EXEC flag is set, memfd is created with executable bit (mode:0777), this is the same as the old behavior of memfd_create. The new pid namespaced sysctl vm.memfd_noexec has 3 values: 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like MFD_EXEC was set. 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like MFD_NOEXEC_SEAL was set. 2: memfd_create() without MFD_NOEXEC_SEAL will be rejected. The sysctl allows finer control of memfd_create for old-software that doesn't set the executable bit, for example, a container with vm.memfd_noexec=1 means the old-software will create non-executable memfd by default. Also, the value of memfd_noexec is passed to child namespace at creation time. For example, if the init namespace has vm.memfd_noexec=2, all its children namespaces will be created with 2. Signed-off-by: Jeff Xu Co-developed-by: Daniel Verkamp Signed-off-by: Daniel Verkamp Reported-by: kernel test robot --- include/linux/pid_namespace.h | 19 +++++++++++ include/uapi/linux/memfd.h | 4 +++ kernel/pid_namespace.c | 5 +++ kernel/pid_sysctl.h | 59 +++++++++++++++++++++++++++++++++++ mm/memfd.c | 48 ++++++++++++++++++++++++++-- 5 files changed, 133 insertions(+), 2 deletions(-) create mode 100644 kernel/pid_sysctl.h diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h index 07481bb87d4e..a4789a7b34a9 100644 --- a/include/linux/pid_namespace.h +++ b/include/linux/pid_namespace.h @@ -16,6 +16,21 @@ struct fs_pin; +#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE) +/* + * sysctl for vm.memfd_noexec + * 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL + * acts like MFD_EXEC was set. + * 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL + * acts like MFD_NOEXEC_SEAL was set. + * 2: memfd_create() without MFD_NOEXEC_SEAL will be + * rejected. + */ +#define MEMFD_NOEXEC_SCOPE_EXEC 0 +#define MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL 1 +#define MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED 2 +#endif + struct pid_namespace { struct idr idr; struct rcu_head rcu; @@ -31,6 +46,10 @@ struct pid_namespace { struct ucounts *ucounts; int reboot; /* group exit code if this pidns was rebooted */ struct ns_common ns; +#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE) + /* sysctl for vm.memfd_noexec */ + int memfd_noexec_scope; +#endif } __randomize_layout; extern struct pid_namespace init_pid_ns; diff --git a/include/uapi/linux/memfd.h b/include/uapi/linux/memfd.h index 7a8a26751c23..273a4e15dfcf 100644 --- a/include/uapi/linux/memfd.h +++ b/include/uapi/linux/memfd.h @@ -8,6 +8,10 @@ #define MFD_CLOEXEC 0x0001U #define MFD_ALLOW_SEALING 0x0002U #define MFD_HUGETLB 0x0004U +/* not executable and sealed to prevent changing to executable. */ +#define MFD_NOEXEC_SEAL 0x0008U +/* executable */ +#define MFD_EXEC 0x0010U /* * Huge page size encoding when MFD_HUGETLB is specified, and a huge page diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index f4f8cb0435b4..8a98b1af9376 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -23,6 +23,7 @@ #include #include #include +#include "pid_sysctl.h" static DEFINE_MUTEX(pid_caches_mutex); static struct kmem_cache *pid_ns_cachep; @@ -110,6 +111,8 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns ns->ucounts = ucounts; ns->pid_allocated = PIDNS_ADDING; + initialize_memfd_noexec_scope(ns); + return ns; out_free_idr: @@ -455,6 +458,8 @@ static __init int pid_namespaces_init(void) #ifdef CONFIG_CHECKPOINT_RESTORE register_sysctl_paths(kern_path, pid_ns_ctl_table); #endif + + register_pid_ns_sysctl_table_vm(); return 0; } diff --git a/kernel/pid_sysctl.h b/kernel/pid_sysctl.h new file mode 100644 index 000000000000..5986d6493b5b --- /dev/null +++ b/kernel/pid_sysctl.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef LINUX_PID_SYSCTL_H +#define LINUX_PID_SYSCTL_H + +#include + +#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE) +static inline void initialize_memfd_noexec_scope(struct pid_namespace *ns) +{ + ns->memfd_noexec_scope = + task_active_pid_ns(current)->memfd_noexec_scope; +} + +static int pid_mfd_noexec_dointvec_minmax(struct ctl_table *table, + int write, void *buf, size_t *lenp, loff_t *ppos) +{ + struct pid_namespace *ns = task_active_pid_ns(current); + struct ctl_table table_copy; + + if (write && !capable(CAP_SYS_ADMIN)) + return -EPERM; + + table_copy = *table; + if (ns != &init_pid_ns) + table_copy.data = &ns->memfd_noexec_scope; + + /* + * set minimum to current value, the effect is only bigger + * value is accepted. + */ + if (*(int *)table_copy.data > *(int *)table_copy.extra1) + table_copy.extra1 = table_copy.data; + + return proc_dointvec_minmax(&table_copy, write, buf, lenp, ppos); +} + +static struct ctl_table pid_ns_ctl_table_vm[] = { + { + .procname = "memfd_noexec", + .data = &init_pid_ns.memfd_noexec_scope, + .maxlen = sizeof(init_pid_ns.memfd_noexec_scope), + .mode = 0644, + .proc_handler = pid_mfd_noexec_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_TWO, + }, + { } +}; +static struct ctl_path vm_path[] = { { .procname = "vm", }, { } }; +static inline void register_pid_ns_sysctl_table_vm(void) +{ + register_sysctl_paths(vm_path, pid_ns_ctl_table_vm); +} +#else +static inline void set_memfd_noexec_scope(struct pid_namespace *ns) {} +static inline void register_pid_ns_ctl_table_vm(void) {} +#endif + +#endif /* LINUX_PID_SYSCTL_H */ diff --git a/mm/memfd.c b/mm/memfd.c index 4ebeab94aa74..ec70675a7069 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -18,6 +18,7 @@ #include #include #include +#include #include /* @@ -263,12 +264,14 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg) #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1) #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN) -#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB) +#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_NOEXEC_SEAL | MFD_EXEC) SYSCALL_DEFINE2(memfd_create, const char __user *, uname, unsigned int, flags) { + char comm[TASK_COMM_LEN]; + struct pid_namespace *ns; unsigned int *file_seals; struct file *file; int fd, error; @@ -285,6 +288,39 @@ SYSCALL_DEFINE2(memfd_create, return -EINVAL; } + /* Invalid if both EXEC and NOEXEC_SEAL are set.*/ + if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL)) + return -EINVAL; + + if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) { +#ifdef CONFIG_SYSCTL + int sysctl = MEMFD_NOEXEC_SCOPE_EXEC; + + ns = task_active_pid_ns(current); + if (ns) + sysctl = ns->memfd_noexec_scope; + + switch (sysctl) { + case MEMFD_NOEXEC_SCOPE_EXEC: + flags |= MFD_EXEC; + break; + case MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL: + flags |= MFD_NOEXEC_SEAL; + break; + default: + pr_warn_ratelimited( + "memfd_create(): MFD_NOEXEC_SEAL is enforced, pid=%d '%s'\n", + task_pid_nr(current), get_task_comm(comm, current)); + return -EINVAL; + } +#else + flags |= MFD_EXEC; +#endif + pr_warn_ratelimited( + "memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=%d '%s'\n", + task_pid_nr(current), get_task_comm(comm, current)); + } + /* length includes terminating zero */ len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1); if (len <= 0) @@ -328,7 +364,15 @@ SYSCALL_DEFINE2(memfd_create, file->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE; file->f_flags |= O_LARGEFILE; - if (flags & MFD_ALLOW_SEALING) { + if (flags & MFD_NOEXEC_SEAL) { + struct inode *inode = file_inode(file); + + inode->i_mode &= ~0111; + file_seals = memfd_file_seals_ptr(file); + *file_seals &= ~F_SEAL_SEAL; + *file_seals |= F_SEAL_EXEC; + } else if (flags & MFD_ALLOW_SEALING) { + /* MFD_EXEC and MFD_ALLOW_SEALING are set */ file_seals = memfd_file_seals_ptr(file); *file_seals &= ~F_SEAL_SEAL; } From patchwork Wed Dec 7 15:49:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 631779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54298C6370C for ; Wed, 7 Dec 2022 15:49:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229705AbiLGPt4 (ORCPT ); Wed, 7 Dec 2022 10:49:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50758 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229829AbiLGPtt (ORCPT ); Wed, 7 Dec 2022 10:49:49 -0500 Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E0E33AC14 for ; Wed, 7 Dec 2022 07:49:49 -0800 (PST) Received: by mail-pj1-x1029.google.com with SMTP id k88-20020a17090a4ce100b00219d0b857bcso2054201pjh.1 for ; Wed, 07 Dec 2022 07:49:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=68tZIB+0aYpPIylSIqK6cc/meXXHAi0u5AcBKUyeGcY=; b=Pgcz8EqFIllSIrbiyT8+zFROC9GpuYiPrYWAWuws/ZAml4A7Ugasa59H8dzt4twoB4 O0cAxxO9Mke403E/Mj+rjoT+GeN92NHh57bpsegW33gdWlb7tQAmt+ewJoJYOEJvEyeH jGBefsgqUB3ZHzuky+3uRwszEnFjlO/ZS+68A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=68tZIB+0aYpPIylSIqK6cc/meXXHAi0u5AcBKUyeGcY=; b=t7ywA0603ugzk9Wg7WMmwYIb0dzXlK/ML9sHSpnmmcEmCg/02blkwnxuzKNlIvh1HH AN5LXfg/2JgVUwz97JWzDZOn9+69L8y67hKME3IIQZrgPc484SQhITyRMlocAGlgWlyN /NMvfYv17XSRNAm4JoOJPAYr3qWI09rD8jx6Do7rt80+dbr/bGlabDyskIMP9Bp6UPZw pe1PrFXm6FC8sHTMEWx0lG07XHnaPhOktsWWKj1p66beD/AAcfviYQSvPXaSbO2m2gXl THC8vWf+f3SkCEZ0HxNpXJVQGExDT8cJ5uzVoZSsBzFNLGhjeXnUbIHPT1kAwWdcBDRv aSjQ== X-Gm-Message-State: ANoB5pmBjyfvwxW8yWrRvI1NqcE/S5Fsxdk/XyNzfvFQOZ692SE2Hx3D quy4ZwAtTMhN3u3qzK4zEn5NVa1VtHv+/xc0 X-Google-Smtp-Source: AA0mqf6RkUVGcJjeQXvXQRSUZJARW3Rj32QYHY9STgcjYfWrJ0sPoKsrx/6HB7CijCJ1h5EjJkg5dg== X-Received: by 2002:a05:6a21:9991:b0:a4:5f8d:805a with SMTP id ve17-20020a056a21999100b000a45f8d805amr1331414pzb.53.1670428188610; Wed, 07 Dec 2022 07:49:48 -0800 (PST) Received: from jeffxud.c.googlers.com.com (30.202.168.34.bc.googleusercontent.com. [34.168.202.30]) by smtp.gmail.com with ESMTPSA id a9-20020a170902ecc900b0017f7628cbddsm14920934plh.30.2022.12.07.07.49.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Dec 2022 07:49:47 -0800 (PST) From: jeffxu@chromium.org To: skhan@linuxfoundation.org, keescook@chromium.org Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com, dverkamp@chromium.org, hughd@google.com, jeffxu@google.com, jorgelo@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com, linux-hardening@vger.kernel.org Subject: [PATCH v6 4/6] mm/memfd: Add write seals when apply SEAL_EXEC to executable memfd Date: Wed, 7 Dec 2022 15:49:37 +0000 Message-Id: <20221207154939.2532830-5-jeffxu@google.com> X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog In-Reply-To: <20221207154939.2532830-1-jeffxu@google.com> References: <20221207154939.2532830-1-jeffxu@google.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Jeff Xu In order to avoid WX mappings, add F_SEAL_WRITE when apply F_SEAL_EXEC to an executable memfd, so W^X from start. This implys application need to fill the content of the memfd first, after F_SEAL_EXEC is applied, application can no longer modify the content of the memfd. Typically, application seals the memfd right after writing to it. For example: 1. memfd_create(MFD_EXEC). 2. write() code to the memfd. 3. fcntl(F_ADD_SEALS, F_SEAL_EXEC) to convert the memfd to W^X. 4. call exec() on the memfd. Signed-off-by: Jeff Xu Reviewed-by: Kees Cook --- mm/memfd.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/memfd.c b/mm/memfd.c index ec70675a7069..92f0a5765f7c 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -222,6 +222,12 @@ static int memfd_add_seals(struct file *file, unsigned int seals) } } + /* + * SEAL_EXEC implys SEAL_WRITE, making W^X from the start. + */ + if (seals & F_SEAL_EXEC && inode->i_mode & 0111) + seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE; + *file_seals |= seals; error = 0;