diff mbox series

[v2,1/2] memfd: fix MFD_NOEXEC_SEAL to be non-sealable by default

Message ID 20240524033933.135049-2-jeffxu@google.com
State New
Headers show
Series [v2,1/2] memfd: fix MFD_NOEXEC_SEAL to be non-sealable by default | expand

Commit Message

Jeff Xu May 24, 2024, 3:39 a.m. UTC
From: Jeff Xu <jeffxu@google.com>

By default, memfd_create() creates a non-sealable MFD, unless the
MFD_ALLOW_SEALING flag is set.

When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created
with that flag is sealable, even though MFD_ALLOW_SEALING is not set.
This patch changes MFD_NOEXEC_SEAL to be non-sealable by default,
unless MFD_ALLOW_SEALING is explicitly set.

This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL
is new, we expect not many applications will rely on the nature of
MFD_NOEXEC_SEAL being sealable. In most cases, the application already
sets MFD_ALLOW_SEALING if they need a sealable MFD.

Additionally, this enhances the useability of  pid namespace sysctl
vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will
add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or
MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD
to be sealable. This means, any application that does not desire this
behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to
migrate/enforce non-executable MFD. This adjustment ensures that
applications can anticipate that the sealable characteristic will
remain unmodified by vm.memfd_noexec.

This patch was initially developed by Barnabás Pőcze, and Barnabás
used Debian Code Search and GitHub to try to find potential breakages
and could only find a single one. Dbus-broker's memfd_create() wrapper
is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to
work around it [1]. This workaround will break. Luckily, this only
affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[2]. In
addition, David Rheinsberg also raised similar fix in [3]

[1]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
[2]: https://github.com/bus1/dbus-broker/pull/366
[3]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/

Cc: stable@vger.kernel.org
Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
Signed-off-by: Jeff Xu <jeffxu@google.com>
Reviewed-by: David Rheinsberg <david@readahead.eu>
---
 mm/memfd.c                                 |  9 ++++----
 tools/testing/selftests/memfd/memfd_test.c | 26 +++++++++++++++++++++-
 2 files changed, 29 insertions(+), 6 deletions(-)

Comments

Barnabás Pőcze May 31, 2024, 6:56 p.m. UTC | #1
2024. május 30., csütörtök 0:24 keltezéssel, Jeff Xu <jeffxu@google.com> írta:

> On Wed, May 29, 2024 at 2:46 PM Barnabás Pőcze <pobrn@protonmail.com> wrote:
> >
> > Hi
> >
> >
> > 2024. május 29., szerda 23:30 keltezéssel, Jeff Xu <jeffxu@google.com> írta:
> >
> > > Hi David and Barnabás
> > >
> > > On Fri, May 24, 2024 at 7:15 AM David Rheinsberg <david@readahead.eu> wrote:
> > > >
> > > > Hi
> > > >
> > > > On Fri, May 24, 2024, at 5:39 AM, jeffxu@chromium.org wrote:
> > > > > From: Jeff Xu <jeffxu@google.com>
> > > > >
> > > > > By default, memfd_create() creates a non-sealable MFD, unless the
> > > > > MFD_ALLOW_SEALING flag is set.
> > > > >
> > > > > When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created
> > > > > with that flag is sealable, even though MFD_ALLOW_SEALING is not set.
> > > > > This patch changes MFD_NOEXEC_SEAL to be non-sealable by default,
> > > > > unless MFD_ALLOW_SEALING is explicitly set.
> > > > >
> > > > > This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL
> > > > > is new, we expect not many applications will rely on the nature of
> > > > > MFD_NOEXEC_SEAL being sealable. In most cases, the application already
> > > > > sets MFD_ALLOW_SEALING if they need a sealable MFD.
> > > >
> > > > This does not really reflect the effort that went into this. Shouldn't this be something along the lines of:
> > > >
> > > >     This is a non-backward compatible change. However, MFD_NOEXEC_SEAL
> > > >     was only recently introduced and a codesearch revealed no breaking
> > > >     users apart from dbus-broker unit-tests (which have a patch pending
> > > >     and explicitly support this change).
> > > >
> > > Actually, I think we might need to hold on to this change. With debian
> > > code search, I found more codes that already use MFD_NOEXEC_SEAL
> > > without MFD_ALLOW_SEALING. e.g. systemd [1], [2] [3]
> >
> > Yes, I have looked at those as well, and as far as I could tell,
> > they are not affected. Have I missed something?
> >
> In the example, the MFD was created then passed into somewhere else
> (safe_fork_full, open_serialization_fd, etc.), the scope and usage of
> mfd isn't that clear to me, you might have checked all the user cases.
> In addition, MFD_NOEXEC_SEAL  exists in libc and rust and go lib.  I
> don't know if debian code search is sufficient to cover enough apps .
> There is a certain risk.
> 
> Fundamentally, I'm not convinced that making MFD default-non-sealable
> has  meaningful benefit, especially when MFD_NOEXEC_SEAL is new.

Certainly, there is always a risk, I did not mean to imply that there isn't.
However, I believe this risk is low enough to at least warrant an attempt at
eliminating this inconsistency. It can always be reverted if it turns out that
the effects have been vastly underestimated by me.

So I would still like to see this change merged.


Regards,
Barnabás Pőcze


> 
> 
> >
> > Regards,
> > Barnabás
> >
> >
> > >
> > > I'm not sure if this  will break  more applications not-knowingly that
> > > have started relying on MFD_NOEXEC_SEAL being sealable. The feature
> > > has been out for more than a year.
> > >
> > > Would you consider my augments in [4] to make MFD to be sealable by default ?
> > >
> > > At this moment, I'm willing to add a document to clarify that
> > > MFD_NOEXEC_SEAL is sealable by default, and that an app that needs
> > > non-sealable MFD can  set  SEAL_SEAL.  Because both MFD_NOEXEC_SEAL
> > > and vm.memfd_noexec are new,  I don't think it breaks the existing
> > > ABI, and vm.memfd_noexec=0 is there for backward compatibility
> > > reasons. Besides, I honestly think there is little reason that MFD
> > > needs to be non-sealable by default.  There might be few rare cases,
> > > but the majority of apps don't need that.  On the flip side, the fact
> > > that MFD is set up to be sealable by default is a nice bonus for an
> > > app - it makes it easier for apps to use the sealing feature.
> > >
> > > What do you think ?
> > >
> > > Thanks
> > > -Jeff
> > >
> > > [1] https://codesearch.debian.net/search?q=MFD_NOEXEC_SEAL
> > > [2] https://codesearch.debian.net/show?file=systemd_256~rc3-5%2Fsrc%2Fhome%2Fhomed-home.c&line=1274
> > > [3] https://sources.debian.org/src/elogind/255.5-1debian1/src/shared/serialize.c/?hl=558#L558
> > > [4] https://lore.kernel.org/lkml/CALmYWFuPBEM2DE97mQvB2eEgSO9Dvt=uO9OewMhGfhGCY66Hbw@mail.gmail.com/
> > >
> > >
> > > > > Additionally, this enhances the useability of  pid namespace sysctl
> > > > > vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will
> > > > > add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or
> > > > > MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD
> > > > > to be sealable. This means, any application that does not desire this
> > > > > behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to
> > > > > migrate/enforce non-executable MFD. This adjustment ensures that
> > > > > applications can anticipate that the sealable characteristic will
> > > > > remain unmodified by vm.memfd_noexec.
> > > > >
> > > > > This patch was initially developed by Barnabás Pőcze, and Barnabás
> > > > > used Debian Code Search and GitHub to try to find potential breakages
> > > > > and could only find a single one. Dbus-broker's memfd_create() wrapper
> > > > > is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to
> > > > > work around it [1]. This workaround will break. Luckily, this only
> > > > > affects the test suite, it does not affect
> > > > > the normal operations of dbus-broker. There is a PR with a fix[2]. In
> > > > > addition, David Rheinsberg also raised similar fix in [3]
> > > > >
> > > > > [1]:
> > > > > https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> > > > > [2]: https://github.com/bus1/dbus-broker/pull/366
> > > > > [3]:
> > > > > https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> > > > >
> > > > > Cc: stable@vger.kernel.org
> > > > > Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> > > > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> > > > > Signed-off-by: Jeff Xu <jeffxu@google.com>
> > > > > Reviewed-by: David Rheinsberg <david@readahead.eu>
> > > >
> > > > Looks good! Thanks!
> > > > David
> > >
Jeff Xu June 7, 2024, 3:59 p.m. UTC | #2
Hi Barnabás

On Fri, May 31, 2024 at 11:56 AM Barnabás Pőcze <pobrn@protonmail.com> wrote:
>
> 2024. május 30., csütörtök 0:24 keltezéssel, Jeff Xu <jeffxu@google.com> írta:
>
> > On Wed, May 29, 2024 at 2:46 PM Barnabás Pőcze <pobrn@protonmail.com> wrote:
> > >
> > > Hi
> > >
> > >
> > > 2024. május 29., szerda 23:30 keltezéssel, Jeff Xu <jeffxu@google.com> írta:
> > >
> > > > Hi David and Barnabás
> > > >
> > > > On Fri, May 24, 2024 at 7:15 AM David Rheinsberg <david@readahead.eu> wrote:
> > > > >
> > > > > Hi
> > > > >
> > > > > On Fri, May 24, 2024, at 5:39 AM, jeffxu@chromium.org wrote:
> > > > > > From: Jeff Xu <jeffxu@google.com>
> > > > > >
> > > > > > By default, memfd_create() creates a non-sealable MFD, unless the
> > > > > > MFD_ALLOW_SEALING flag is set.
> > > > > >
> > > > > > When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created
> > > > > > with that flag is sealable, even though MFD_ALLOW_SEALING is not set.
> > > > > > This patch changes MFD_NOEXEC_SEAL to be non-sealable by default,
> > > > > > unless MFD_ALLOW_SEALING is explicitly set.
> > > > > >
> > > > > > This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL
> > > > > > is new, we expect not many applications will rely on the nature of
> > > > > > MFD_NOEXEC_SEAL being sealable. In most cases, the application already
> > > > > > sets MFD_ALLOW_SEALING if they need a sealable MFD.
> > > > >
> > > > > This does not really reflect the effort that went into this. Shouldn't this be something along the lines of:
> > > > >
> > > > >     This is a non-backward compatible change. However, MFD_NOEXEC_SEAL
> > > > >     was only recently introduced and a codesearch revealed no breaking
> > > > >     users apart from dbus-broker unit-tests (which have a patch pending
> > > > >     and explicitly support this change).
> > > > >
> > > > Actually, I think we might need to hold on to this change. With debian
> > > > code search, I found more codes that already use MFD_NOEXEC_SEAL
> > > > without MFD_ALLOW_SEALING. e.g. systemd [1], [2] [3]
> > >
> > > Yes, I have looked at those as well, and as far as I could tell,
> > > they are not affected. Have I missed something?
> > >
> > In the example, the MFD was created then passed into somewhere else
> > (safe_fork_full, open_serialization_fd, etc.), the scope and usage of
> > mfd isn't that clear to me, you might have checked all the user cases.
> > In addition, MFD_NOEXEC_SEAL  exists in libc and rust and go lib.  I
> > don't know if debian code search is sufficient to cover enough apps .
> > There is a certain risk.
> >
> > Fundamentally, I'm not convinced that making MFD default-non-sealable
> > has  meaningful benefit, especially when MFD_NOEXEC_SEAL is new.
>
> Certainly, there is always a risk, I did not mean to imply that there isn't.
> However, I believe this risk is low enough to at least warrant an attempt at
> eliminating this inconsistency. It can always be reverted if it turns out that
> the effects have been vastly underestimated by me.
>
> So I would still like to see this change merged.
>

The MFD_NOEXEC_SEAL is a new flag, technically, ABI is not broken.
The sysctl vm.memfd_noexec=1 or 2, is meant to help
migration/enforcement of MFD_NOEXEC_SEAL, so it will break application
if it is used pre-maturely, that is by-design.

I think the main problem here is lack of documentation, instead of a code bug.
ABI change shouldn't be treated lightly, given the risk, I would like
to keep the API the same and add the documentation instead. I think
that is the best route forward.

Best Regards,
-Jeff



>
> Regards,
> Barnabás Pőcze
>
>
> >
> >
> > >
> > > Regards,
> > > Barnabás
> > >
> > >
> > > >
> > > > I'm not sure if this  will break  more applications not-knowingly that
> > > > have started relying on MFD_NOEXEC_SEAL being sealable. The feature
> > > > has been out for more than a year.
> > > >
> > > > Would you consider my augments in [4] to make MFD to be sealable by default ?
> > > >
> > > > At this moment, I'm willing to add a document to clarify that
> > > > MFD_NOEXEC_SEAL is sealable by default, and that an app that needs
> > > > non-sealable MFD can  set  SEAL_SEAL.  Because both MFD_NOEXEC_SEAL
> > > > and vm.memfd_noexec are new,  I don't think it breaks the existing
> > > > ABI, and vm.memfd_noexec=0 is there for backward compatibility
> > > > reasons. Besides, I honestly think there is little reason that MFD
> > > > needs to be non-sealable by default.  There might be few rare cases,
> > > > but the majority of apps don't need that.  On the flip side, the fact
> > > > that MFD is set up to be sealable by default is a nice bonus for an
> > > > app - it makes it easier for apps to use the sealing feature.
> > > >
> > > > What do you think ?
> > > >
> > > > Thanks
> > > > -Jeff
> > > >
> > > > [1] https://codesearch.debian.net/search?q=MFD_NOEXEC_SEAL
> > > > [2] https://codesearch.debian.net/show?file=systemd_256~rc3-5%2Fsrc%2Fhome%2Fhomed-home.c&line=1274
> > > > [3] https://sources.debian.org/src/elogind/255.5-1debian1/src/shared/serialize.c/?hl=558#L558
> > > > [4] https://lore.kernel.org/lkml/CALmYWFuPBEM2DE97mQvB2eEgSO9Dvt=uO9OewMhGfhGCY66Hbw@mail.gmail.com/
> > > >
> > > >
> > > > > > Additionally, this enhances the useability of  pid namespace sysctl
> > > > > > vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will
> > > > > > add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or
> > > > > > MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD
> > > > > > to be sealable. This means, any application that does not desire this
> > > > > > behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to
> > > > > > migrate/enforce non-executable MFD. This adjustment ensures that
> > > > > > applications can anticipate that the sealable characteristic will
> > > > > > remain unmodified by vm.memfd_noexec.
> > > > > >
> > > > > > This patch was initially developed by Barnabás Pőcze, and Barnabás
> > > > > > used Debian Code Search and GitHub to try to find potential breakages
> > > > > > and could only find a single one. Dbus-broker's memfd_create() wrapper
> > > > > > is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to
> > > > > > work around it [1]. This workaround will break. Luckily, this only
> > > > > > affects the test suite, it does not affect
> > > > > > the normal operations of dbus-broker. There is a PR with a fix[2]. In
> > > > > > addition, David Rheinsberg also raised similar fix in [3]
> > > > > >
> > > > > > [1]:
> > > > > > https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> > > > > > [2]: https://github.com/bus1/dbus-broker/pull/366
> > > > > > [3]:
> > > > > > https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> > > > > >
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> > > > > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> > > > > > Signed-off-by: Jeff Xu <jeffxu@google.com>
> > > > > > Reviewed-by: David Rheinsberg <david@readahead.eu>
> > > > >
> > > > > Looks good! Thanks!
> > > > > David
> > > >
diff mbox series

Patch

diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@  SYSCALL_DEFINE2(memfd_create,
 
 		inode->i_mode &= ~0111;
 		file_seals = memfd_file_seals_ptr(file);
-		if (file_seals) {
-			*file_seals &= ~F_SEAL_SEAL;
+		if (file_seals)
 			*file_seals |= F_SEAL_EXEC;
-		}
-	} else if (flags & MFD_ALLOW_SEALING) {
-		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
+	}
+
+	if (flags & MFD_ALLOW_SEALING) {
 		file_seals = memfd_file_seals_ptr(file);
 		if (file_seals)
 			*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..8579a93d006b 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@  static void test_noexec_seal(void)
 			    mfd_def_size,
 			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
 	mfd_assert_mode(fd, 0666);
-	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
 }
@@ -1169,6 +1169,14 @@  static void test_sysctl_sysctl0(void)
 	mfd_assert_has_seals(fd, 0);
 	mfd_assert_chmod(fd, 0644);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_0_dfl",
+			    mfd_def_size,
+			    MFD_CLOEXEC);
+	mfd_assert_mode(fd, 0777);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL);
+	mfd_assert_chmod(fd, 0644);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl0(void)
@@ -1206,6 +1214,14 @@  static void test_sysctl_sysctl1(void)
 	mfd_assert_has_seals(fd, F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_1_noexec_nosealable",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC | F_SEAL_SEAL);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl1(void)
@@ -1238,6 +1254,14 @@  static void test_sysctl_sysctl2(void)
 	mfd_assert_has_seals(fd, F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_2_noexec_notsealable",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC | F_SEAL_SEAL);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl2(void)