diff mbox series

[v7,04/10] selftests/harness: Fix interleaved scheduling leading to race conditions

Message ID 20240511171445.904356-5-mic@digikod.net
State Accepted
Commit a86f18903db9211e265cc130b61adb175b7a4c42
Headers show
Series Fix Kselftest's vfork() side effects | expand

Commit Message

Mickaël Salaün May 11, 2024, 5:14 p.m. UTC
Fix a race condition when running several FIXTURE_TEARDOWN() managing
the same resource.  This fixes a race condition in the Landlock file
system tests when creating or unmounting the same directory.

Using clone3() with CLONE_VFORK guarantees that the child and grandchild
test processes are sequentially scheduled.  This is implemented with a
new clone3_vfork() helper replacing the fork() call.

This avoids triggering this error in __wait_for_test():
  Test ended in some other way [127]

Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240511171445.904356-5-mic@digikod.net
---

Changes since v2:
* Replace __attribute__((__unused__)) with inline for clone3_vfork()
  (suggested by Kees and Jakub)
---
 tools/testing/selftests/kselftest_harness.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Mark Brown May 27, 2024, 7:07 p.m. UTC | #1
On Sat, May 11, 2024 at 07:14:39PM +0200, Mickaël Salaün wrote:

> Fix a race condition when running several FIXTURE_TEARDOWN() managing
> the same resource.  This fixes a race condition in the Landlock file
> system tests when creating or unmounting the same directory.

> Using clone3() with CLONE_VFORK guarantees that the child and grandchild
> test processes are sequentially scheduled.  This is implemented with a
> new clone3_vfork() helper replacing the fork() call.

This is now in mainline and appears to be causing several tests (at
least the ptrace vmaccess global_attach test on arm64, possibly also
some of the epoll tests) that previously were timed out by the harness
to to hang instead.  A bisect seems to point at this patch in
particular, there was a bunch of discussion of the fallout of these
patches but I'm afraid I lost track of it, is there something in flight
for this?  -next is affected as well from the looks of it.

Log of the ptrace issue (not that it's useful, the job just gets killed
by the test runner):

   https://lava.sirena.org.uk/scheduler/job/307984

Bisect log:

git bisect start
# status: waiting for both good and bad commits
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# status: waiting for bad commit, 1 good commit known
# bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9
git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
# good: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910
# good: [2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c] Merge tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
git bisect good 2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c
# good: [e858beeddfa3a400844c0e22d2118b3b52f1ea5e] bcachefs: If we run merges at a lower watermark, they must be nonblocking
git bisect good e858beeddfa3a400844c0e22d2118b3b52f1ea5e
# good: [e43afae4a335ac0bf54c7a8f23ed65dd55449649] Merge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good e43afae4a335ac0bf54c7a8f23ed65dd55449649
# good: [09e10499ee6a5a89fc352f25881276398a49596a] Merge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good 09e10499ee6a5a89fc352f25881276398a49596a
# good: [3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc] Merge tag 'usb-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc
# good: [62788b0f225da1837ad38101112e2c49123470ee] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
git bisect good 62788b0f225da1837ad38101112e2c49123470ee
# good: [ed44935c330a2633440e8d2660db3c7538eeaf10] Merge tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
git bisect good ed44935c330a2633440e8d2660db3c7538eeaf10
# good: [c22c3e0753807feee1391a22228b0d5e6ba39b74] Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good c22c3e0753807feee1391a22228b0d5e6ba39b74
# good: [b61821bb32c5577272408e1b05e6a0879a64257f] Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good b61821bb32c5577272408e1b05e6a0879a64257f
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [3656bc23429a4d539c81b5cb8f17ceeeeca8901a] selftests/landlock: Do not allocate memory in fixture data
git bisect bad 3656bc23429a4d539c81b5cb8f17ceeeeca8901a
# good: [7e4042abe2ee7c0977fd8bb049a6991b174a5e6f] selftests/landlock: Fix FS tests when run on a private mount point
git bisect good 7e4042abe2ee7c0977fd8bb049a6991b174a5e6f
# bad: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
git bisect bad a86f18903db9211e265cc130b61adb175b7a4c42
# good: [fff37bd32c7605d93bf900c4c318d56d12000048] selftests/harness: Fix fixture teardown
git bisect good fff37bd32c7605d93bf900c4c318d56d12000048
# first bad commit: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
Mark Brown June 3, 2024, 4:27 p.m. UTC | #2
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
> On Sat, May 11, 2024 at 07:14:39PM +0200, Mickaël Salaün wrote:

> > Fix a race condition when running several FIXTURE_TEARDOWN() managing
> > the same resource.  This fixes a race condition in the Landlock file
> > system tests when creating or unmounting the same directory.
> 
> > Using clone3() with CLONE_VFORK guarantees that the child and grandchild
> > test processes are sequentially scheduled.  This is implemented with a
> > new clone3_vfork() helper replacing the fork() call.
> 
> This is now in mainline and appears to be causing several tests (at
> least the ptrace vmaccess global_attach test on arm64, possibly also
> some of the epoll tests) that previously were timed out by the harness
> to to hang instead.  A bisect seems to point at this patch in
> particular, there was a bunch of discussion of the fallout of these
> patches but I'm afraid I lost track of it, is there something in flight
> for this?  -next is affected as well from the looks of it.

FWIW I'm still seeing this on -rc2...
Mark Brown June 3, 2024, 5:22 p.m. UTC | #3
On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
> On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:

> > This is now in mainline and appears to be causing several tests (at
> > least the ptrace vmaccess global_attach test on arm64, possibly also
> > some of the epoll tests) that previously were timed out by the harness
> > to to hang instead.  A bisect seems to point at this patch in
> > particular, there was a bunch of discussion of the fallout of these
> > patches but I'm afraid I lost track of it, is there something in flight
> > for this?  -next is affected as well from the looks of it.

> FWIW I'm still seeing this on -rc2...

AFAICT this is due to the switch to using clone3() with CLONE_VFORK
to start the test which means we never even call alarm() to set up the
timeout for the test, let alone have the signal for it delivered.  I'm a
confused about how this could ever work, with clone_vfork() the parent
shouldn't run until the child execs (which won't happen here) or exits.
Since we don't call alarm() until after we started the child we never
actually get that far, but even if we reorder things we'll not get the
signal for the alarm if the child messes up since the parent is
suspended.

I'm not clear what the original race being fixed here was but it seems
like we should revert this since the timeout functionality is pretty
important?
Mickaël Salaün June 4, 2024, 4:06 p.m. UTC | #4
On Mon, Jun 03, 2024 at 06:22:32PM +0100, Mark Brown wrote:
> On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
> > On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
> 
> > > This is now in mainline and appears to be causing several tests (at
> > > least the ptrace vmaccess global_attach test on arm64, possibly also
> > > some of the epoll tests) that previously were timed out by the harness
> > > to to hang instead.  A bisect seems to point at this patch in
> > > particular, there was a bunch of discussion of the fallout of these
> > > patches but I'm afraid I lost track of it, is there something in flight
> > > for this?  -next is affected as well from the looks of it.

Thanks for the heads up.  I warned about not being able to test
everything when fixing kselftest last time, but nobody show up.  Is
there an easy way to run most kselftests?  We really need a (more
accessible) CI...

> 
> > FWIW I'm still seeing this on -rc2...
> 
> AFAICT this is due to the switch to using clone3() with CLONE_VFORK

I guess it started with the previous vfork() that was later replaced
with CLONE_VFORK.

> to start the test which means we never even call alarm() to set up the
> timeout for the test, let alone have the signal for it delivered.  I'm a
> confused about how this could ever work, with clone_vfork() the parent
> shouldn't run until the child execs (which won't happen here) or exits.
> Since we don't call alarm() until after we started the child we never
> actually get that far, but even if we reorder things we'll not get the
> signal for the alarm if the child messes up since the parent is
> suspended.
> 
> I'm not clear what the original race being fixed here was but it seems
> like we should revert this since the timeout functionality is pretty
> important?

It took me a while to fix all the previous issues and it would be much
easier to just fix this issue too.

I'm working on it.
Mark Brown June 4, 2024, 4:18 p.m. UTC | #5
On Tue, Jun 04, 2024 at 06:06:48PM +0200, Mickaël Salaün wrote:

> Thanks for the heads up.  I warned about not being able to test
> everything when fixing kselftest last time, but nobody show up.  Is
> there an easy way to run most kselftests?  We really need a (more
> accessible) CI...

You can just invoke the top level kselftests Makefile but between things
being flaky and runtime requirements there's a bunch of noise there.
KernelCI covers a bunch of it and would be my go to, I've got a good
chunk of the selftests that actually build and run reliably in my
personal CI but it has no visible UI.  Part of the issue here might be
platform specifics, I'm seeing this on arm64.  

> > > FWIW I'm still seeing this on -rc2...

> > AFAICT this is due to the switch to using clone3() with CLONE_VFORK

> I guess it started with the previous vfork() that was later replaced
> with CLONE_VFORK.

Bisect did seem to point at this commit FWIW, I've not dug into any API
differences or anything here.  The immediate thing being replaced was a
plain fork() though I see it was vfork() at some point before that, and
I'd not have noticed if the individual testcases weren't hanging so the
timeout was needed.

> > I'm not clear what the original race being fixed here was but it seems
> > like we should revert this since the timeout functionality is pretty
> > important?

> It took me a while to fix all the previous issues and it would be much
> easier to just fix this issue too.

> I'm working on it.

Great, thanks.
diff mbox series

Patch

diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 55699a762c45..9d7178a71c2c 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -66,6 +66,8 @@ 
 #include <sys/wait.h>
 #include <unistd.h>
 #include <setjmp.h>
+#include <syscall.h>
+#include <linux/sched.h>
 
 #include "kselftest.h"
 
@@ -80,6 +82,17 @@ 
 #  define TH_LOG_ENABLED 1
 #endif
 
+/* Wait for the child process to end but without sharing memory mapping. */
+static inline pid_t clone3_vfork(void)
+{
+	struct clone_args args = {
+		.flags = CLONE_VFORK,
+		.exit_signal = SIGCHLD,
+	};
+
+	return syscall(__NR_clone3, &args, sizeof(args));
+}
+
 /**
  * TH_LOG()
  *
@@ -1183,7 +1196,7 @@  void __run_test(struct __fixture_metadata *f,
 	fflush(stdout);
 	fflush(stderr);
 
-	t->pid = fork();
+	t->pid = clone3_vfork();
 	if (t->pid < 0) {
 		ksft_print_msg("ERROR SPAWNING TEST CHILD\n");
 		t->exit_code = KSFT_FAIL;