[v2,0/2] livepatch: Move tests from lib/livepatch to selftests/livepatch

Message ID	20220630141226.2802-1-mpdesouza@suse.com
Headers	show Return-Path: <linux-kselftest-owner@kernel.org> From: Marcos Paulo de Souza <mpdesouza@suse.com> To: live-patching@vger.kernel.org, linux-kselftest@vger.kernel.org Cc: shuah@kernel.org, jpoimboe@redhat.com, mbenes@suse.cz, pmladek@suse.com, Marcos Paulo de Souza <mpdesouza@suse.com> Subject: [PATCH v2 0/2] livepatch: Move tests from lib/livepatch to selftests/livepatch Date: Thu, 30 Jun 2022 11:12:24 -0300 Message-Id: <20220630141226.2802-1-mpdesouza@suse.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	livepatch: Move tests from lib/livepatch to selftests/livepatch \| expand [v2,0/2] livepatch: Move tests from lib/livepatch to selftests/livepatch [v2,1/2] livepatch: Move tests from lib/livepatch to selftests/livepatch

Marcos Paulo de Souza June 30, 2022, 2:12 p.m. UTC

Hi there,

this is the v2 of the patchset. The v1 can be found at [1]. There is only one
change in patch 1, which changed the target directory to build the test modules.
All other changes happen in patch 2.

Thanks for reviewing!

Changes from v1:
# test_modules/Makefile
  * Build the test modules targeting /lib/modules, instead of ksrc when building
    from the kernel source.

# test_modules/test_klp_syscall.c
  * Added a parameter array to receive the pids that should transition to the
    new system call. (suggedted by Joe)
  * Create a new sysfs file /sys/kernel/test_klp_syscall/npids to show how many
    pids from the argument need to transition to the new state. (suggested by
    Joe)
  * Fix the PPC32 support by adding the syscall wrapper for archs that select it
    by default, without erroring out. PPC does not set SYSCALL_WRAPPER, so
    having it set in v1 was a mistake. (suggested by Joe)
  * The aarch64 syscall prefix was added too, since the livepatch support will come soon.

# test_binaries/test_klp-call_getpid.c
  * Change %d/%u in printf (suggested byu Joe)
  * Change run -> stop variable name, and inverted the assignments (suggested by
  * Joe).

# File test-syscall.sh
  * Fixed test-syscall.sh to call test_klp-call-getpid in test_binaries dir
  * Load test_klp_syscall passed the pids of the test_klp-call_getpid instances.
    Check the sysfs file from test_klp_syscall module to check that all pids
    transitioned correctly. (suggested by Joe)
  * Simplified the loop that calls test_klp-call_getpid. (suggested by Joe)
  * Removed the "success" comment from the script, as it's implicit that it
    succeed. Otherwise load_lp would error out. (suggested by Joe)

* Changed the commit message of patch 2 to further detail what means "tricky"
  when livepatching syscalls. (suggested by Joe)

[1]: 20220603143242.870-1-mpdesouza@suse.com

Marcos Paulo de Souza (2):
  livepatch: Move tests from lib/livepatch to selftests/livepatch
  selftests: livepatch: Test livepatching a heavily called syscall

 arch/s390/configs/debug_defconfig             |   1 -
 arch/s390/configs/defconfig                   |   1 -
 lib/Kconfig.debug                             |  22 ---
 lib/Makefile                                  |   2 -
 lib/livepatch/Makefile                        |  14 --
 tools/testing/selftests/livepatch/Makefile    |  35 +++-
 tools/testing/selftests/livepatch/README      |   5 +-
 tools/testing/selftests/livepatch/config      |   1 -
 .../testing/selftests/livepatch/functions.sh  |  34 ++--
 .../selftests/livepatch/test-callbacks.sh     |  50 +++---
 .../selftests/livepatch/test-ftrace.sh        |   6 +-
 .../selftests/livepatch/test-livepatch.sh     |  10 +-
 .../selftests/livepatch/test-shadow-vars.sh   |   2 +-
 .../testing/selftests/livepatch/test-state.sh |  18 +--
 .../selftests/livepatch/test-syscall.sh       |  52 ++++++
 .../test_binaries/test_klp-call_getpid.c      |  48 ++++++
 .../selftests/livepatch/test_modules/Makefile |  20 +++
 .../test_modules}/test_klp_atomic_replace.c   |   0
 .../test_modules}/test_klp_callbacks_busy.c   |   0
 .../test_modules}/test_klp_callbacks_demo.c   |   0
 .../test_modules}/test_klp_callbacks_demo2.c  |   0
 .../test_modules}/test_klp_callbacks_mod.c    |   0
 .../test_modules}/test_klp_livepatch.c        |   0
 .../test_modules}/test_klp_shadow_vars.c      |   0
 .../livepatch/test_modules}/test_klp_state.c  |   0
 .../livepatch/test_modules}/test_klp_state2.c |   0
 .../livepatch/test_modules}/test_klp_state3.c |   0
 .../livepatch/test_modules/test_klp_syscall.c | 150 ++++++++++++++++++
 28 files changed, 360 insertions(+), 111 deletions(-)
 delete mode 100644 lib/livepatch/Makefile
 create mode 100755 tools/testing/selftests/livepatch/test-syscall.sh
 create mode 100644 tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
 create mode 100644 tools/testing/selftests/livepatch/test_modules/Makefile
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_atomic_replace.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_callbacks_busy.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_callbacks_demo.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_callbacks_demo2.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_callbacks_mod.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_livepatch.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_shadow_vars.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_state.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_state2.c (100%)
 rename {lib/livepatch => tools/testing/selftests/livepatch/test_modules}/test_klp_state3.c (100%)
 create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c

Joe Lawrence July 12, 2022, 2:56 p.m. UTC | #1

On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
> Syscalls are called a tricky way. Some architectures add a prefix to the
> syscall name (SYSCALL_WRAPPER).
> 
> This new test creates one userspace process per online cpu calling getpid
> continuously and tries to livepatch the getpid function. Add the correct
> function prefix for all archs that select HAS_SYSCALL_WRAPPER.
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>

Hi Marcos,

Thanks for working on this and sorry for the delayed reply.  I gave it a
spin yesterday and have a few comments below...

> ---
>  tools/testing/selftests/livepatch/Makefile    |  12 +-
>  .../selftests/livepatch/test-syscall.sh       |  52 ++++++
>  .../test_binaries/test_klp-call_getpid.c      |  48 ++++++
>  .../selftests/livepatch/test_modules/Makefile |   3 +-
>  .../livepatch/test_modules/test_klp_syscall.c | 150 ++++++++++++++++++
>  5 files changed, 261 insertions(+), 4 deletions(-)
>  create mode 100755 tools/testing/selftests/livepatch/test-syscall.sh
>  create mode 100644 tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
>  create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> 
> diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile
> index 5ef492b87bb1..35014197184e 100644
> --- a/tools/testing/selftests/livepatch/Makefile
> +++ b/tools/testing/selftests/livepatch/Makefile
> @@ -1,10 +1,14 @@
>  # SPDX-License-Identifier: GPL-2.0
> +include ../../../build/Build.include
> +include ../../../scripts/Makefile.arch
> +include ../../../scripts/Makefile.include
>  
>  TEST_FILES := settings \
>  		test_modules
>  
>  # We need the test_modules dir in order to make gen_tar and install to work
> -TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
> +TEST_GEN_PROGS_EXTENDED := test_binaries/test_klp-call_getpid \

Before I commit this to muscle memory, would "test_programs" better
describe this directory?  When I see "binaries", I think of source vs.
build and not kernel vs. userspace.

> +			test_modules/test_klp_atomic_replace.ko \
>  			test_modules/test_klp_callbacks_busy.ko \
>  			test_modules/test_klp_callbacks_demo.ko \
>  			test_modules/test_klp_callbacks_demo2.ko \
> @@ -13,7 +17,8 @@ TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
>  			test_modules/test_klp_state.ko \
>  			test_modules/test_klp_state2.ko \
>  			test_modules/test_klp_state3.ko \
> -			test_modules/test_klp_shadow_vars.ko
> +			test_modules/test_klp_shadow_vars.ko \
> +			test_modules/test_klp_syscall.ko
>  
>  TEST_PROGS_EXTENDED := functions.sh
>  TEST_PROGS := \
> @@ -21,7 +26,8 @@ TEST_PROGS := \
>  	test-callbacks.sh \
>  	test-shadow-vars.sh \
>  	test-state.sh \
> -	test-ftrace.sh
> +	test-ftrace.sh \
> +	test-syscall.sh
>  
>  # override lib.mk's default rules
>  OVERRIDE_TARGETS := 1
> diff --git a/tools/testing/selftests/livepatch/test-syscall.sh b/tools/testing/selftests/livepatch/test-syscall.sh
> new file mode 100755
> index 000000000000..fb3270de7f1f
> --- /dev/null
> +++ b/tools/testing/selftests/livepatch/test-syscall.sh
> @@ -0,0 +1,52 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2022 SUSE
> +# Author: Marcos Paulo de Souza <mpdesouza@suse.com>
> +
> +. $(dirname $0)/functions.sh
> +
> +MOD_SYSCALL=test_klp_syscall
> +
> +setup_config
> +
> +# - Start _NRPROC processes calling getpid and load a livepatch to patch the
> +#   getpid syscall
> +
> +start_test "patch getpid syscall while being heavily hammered"
> +
> +for i in $(seq 0 $(( $(getconf _NPROCESSORS_ONLN) -1)) ); do
> +	./test_binaries/test_klp-call_getpid &
> +	pids[$i]="$!"
> +done

On second thought, bash should only iterate through the elements that
are defined, so your earlier version is probably better:

  for i in $(seq 1 $(getconf _NPROCESSORS_ONLN)); do

> +
> +pid_list=$(echo ${pids[@]} | tr ' ' ',')
> +load_lp $MOD_SYSCALL klp_pids=$pid_list
> +

We could wait for all test_klp-call_getpid instances to execute the
livepatch code:

  loop_until 'grep -q '^0$' /sys/kernel/test_klp_syscall/npids'

This has a built in timeout, so it's not infinitely looping.  You
could add some die "reason" logic here, or just let the existing code
cleanup after the failed test.

> +pending_pids=$(cat /sys/kernel/test_klp_syscall/npids)
> +
> +for pid in ${pids[@]}; do
> +	kill $pid || true
> +done
> +
> +disable_lp $MOD_SYSCALL
> +unload_lp $MOD_SYSCALL
> +
> +if [[ "$pending_pids" != "0" ]]; then
> +	echo -e "FAIL\n\n"
> +	die "processes not livepatched: $pending_pids"
> +fi
> +
> +check_result "% insmod test_modules/$MOD_SYSCALL.ko klp_pids=$pid_list
> +livepatch: enabling patch '$MOD_SYSCALL'
> +livepatch: '$MOD_SYSCALL': initializing patching transition
> +livepatch: '$MOD_SYSCALL': starting patching transition
> +livepatch: '$MOD_SYSCALL': completing patching transition
> +livepatch: '$MOD_SYSCALL': patching complete
> +% echo 0 > /sys/kernel/livepatch/$MOD_SYSCALL/enabled
> +livepatch: '$MOD_SYSCALL': initializing unpatching transition
> +livepatch: '$MOD_SYSCALL': starting unpatching transition
> +livepatch: '$MOD_SYSCALL': completing unpatching transition
> +livepatch: '$MOD_SYSCALL': unpatching complete
> +% rmmod $MOD_SYSCALL"
> +
> +exit 0
> diff --git a/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> new file mode 100644
> index 000000000000..ec470660dee6
> --- /dev/null
> +++ b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> @@ -0,0 +1,48 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2017 SUSE
> + * Author: Libor Pechacek <lpechacek@suse.cz>
> + */
> +
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/syscall.h>
> +#include <sys/types.h>
> +#include <signal.h>
> +
> +static int stop = 0;

nit: no need to init global to 0

> +static int sig_int;
> +
> +void hup_handler(int signum)
> +{
> +	stop = 1;
> +}
> +
> +void int_handler(int signum)
> +{
> +	stop = 1;
> +	sig_int = 1;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	pid_t orig_pid, pid;
> +	long count = 0;
> +
> +	signal(SIGHUP, &hup_handler);
> +	signal(SIGINT, &int_handler);
> +
> +	orig_pid = syscall(SYS_getpid);
> +
> +	while (!stop) {
> +		pid = syscall(SYS_getpid);
> +		if (pid != orig_pid)
> +			return 1;

This test doesn't care about the user program return code, but I wonder
if the status should be flipped -- this is the desired code path, not
the one at the end of main(), right?

> +		count++;
> +	}
> +
> +	if (sig_int)
> +		printf("%ld iterations done\n", count);
> +
> +	return 0;
> +}
> diff --git a/tools/testing/selftests/livepatch/test_modules/Makefile b/tools/testing/selftests/livepatch/test_modules/Makefile
> index 1eab43b741cd..ebb754accf46 100644
> --- a/tools/testing/selftests/livepatch/test_modules/Makefile
> +++ b/tools/testing/selftests/livepatch/test_modules/Makefile
> @@ -10,7 +10,8 @@ obj-m += test_klp_atomic_replace.o \
>  	test_klp_state.o \
>  	test_klp_state2.o \
>  	test_klp_state3.o \
> -	test_klp_shadow_vars.o
> +	test_klp_shadow_vars.o \
> +	test_klp_syscall.o
>  
>  %.ko:
>  	make -C $(KDIR) M=$(TESTMODS_DIR) $@
> diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> new file mode 100644
> index 000000000000..e90f4ac8e7a4
> --- /dev/null
> +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2017-2022 SUSE
> + * Authors: Libor Pechacek <lpechacek@suse.cz>
> + *          Nicolai Stange <nstange@suse.de>
> + *          Marcos Paulo de Souza <mpdesouza@suse.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +#include <linux/livepatch.h>
> +
> +#if defined(__x86_64__)
> +#define FN_PREFIX __x64_
> +#elif defined(__s390x__)
> +#define FN_PREFIX __s390x_
> +#elif defined(__aarch64__)
> +#define FN_PREFIX __arm64_
> +#else
> +/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> +#define FN_PREFIX
> +#endif
> +
> +struct klp_pid_t {
> +	pid_t pid;
> +	struct list_head list;
> +};
> +static LIST_HEAD(klp_pid_list);
> +
> +/* Protects klp_pid_list */
> +static DEFINE_MUTEX(kpid_mutex);
> +
> +static int klp_pids[NR_CPUS];
> +static unsigned int npids;
> +module_param_array(klp_pids, int, &npids, 0);
> +
> +static ssize_t npids_show(struct kobject *kobj, struct kobj_attribute *attr,
> +			  char *buf)
> +{
> +	return sprintf(buf, "%u\n", npids);
> +}
> +
> +static struct kobj_attribute klp_attr = __ATTR_RO(npids);
> +static struct kobject *klp_kobj;
> +
> +static void free_klp_pid_list(void)
> +{
> +	struct klp_pid_t *kpid, *temp;
> +
> +	mutex_lock(&kpid_mutex);
> +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> +		list_del(&kpid->list);
> +		kfree(kpid);
> +	}
> +	mutex_unlock(&kpid_mutex);
> +}
> +
> +asmlinkage long lp_sys_getpid(void)
> +{
> +	struct klp_pid_t *kpid, *temp;
> +
> +	/*
> +	 * For each thread calling getpid, check if the pid exists in
> +	 * klp_pid_list. If yes, decrement the npids variable and remove the pid
> +	 * from the list. npids will be later used to ensure that all pids
> +	 * transitioned to the liveaptched state.

I go back and forth with this comment: 1) the code is pretty
self-explanatory and 2) it explains what test-syscall.sh is doing with
npids

Describing the behavioral difference I suggest below would be helpful
(if you agree w/the change).  Something about removing the current pid
from the list, okay, too.

Then as far as describing npids' purpose, I'd put that up at the top of
the file where npids is defined.  It can be as generic as describing
what it's counting.

> +	 */
> +	mutex_lock(&kpid_mutex);
> +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> +		if (current->pid == kpid->pid) {
> +			list_del(&kpid->list);
> +			kfree(kpid);
> +			npids--;
> +			break;

I think it would be safer to return task_tgid_vnr() here, but ...

> +		}
> +	}
> +	mutex_unlock(&kpid_mutex);
> +
> +	return task_tgid_vnr(current);

task_pid_vnr() here.  That way we're only changing behavior for the
processes in the list and not all programs across the system.

> +}
> +
> +static struct klp_func vmlinux_funcs[] = {
> +	{
> +		.old_name = __stringify(FN_PREFIX) "sys_getpid",
> +		.new_func = lp_sys_getpid,
> +	}, {}
> +};
> +
> +static struct klp_object objs[] = {
> +	{
> +		/* name being NULL means vmlinux */
> +		.funcs = vmlinux_funcs,
> +	}, {}
> +};
> +
> +static struct klp_patch patch = {
> +	.mod = THIS_MODULE,
> +	.objs = objs,
> +};
> +
> +static int livepatch_init(void)
> +{
> +	int ret;
> +	struct klp_pid_t *kpid;
> +
> +	if (npids > 0) {
> +		int i;
> +
> +		for (i = 0; i < npids; i++) {
> +			kpid = kmalloc(sizeof(struct klp_pid_t), GFP_KERNEL);
> +			if (!kpid)
> +				goto err_mem;
> +
> +			kpid->pid = klp_pids[i];
> +			list_add(&kpid->list, &klp_pid_list);
> +		}
> +	}
> +
> +	klp_kobj = kobject_create_and_add("test_klp_syscall", kernel_kobj);
> +	if (!klp_kobj)
> +		goto err_mem;
> +
> +	ret = sysfs_create_file(klp_kobj, &klp_attr.attr);
> +	if (ret) {
> +		kobject_put(klp_kobj);
> +		goto out_klp_pid_list;
> +	}
> +
> +	return klp_enable_patch(&patch);
> +
> +err_mem:
> +	ret = -ENOMEM;
> +out_klp_pid_list:
> +	free_klp_pid_list();
> +
> +	return ret;
> +}
> +
> +static void livepatch_exit(void)
> +{
> +	free_klp_pid_list();
> +	kobject_put(klp_kobj);
> +}
> +
> +module_init(livepatch_init);
> +module_exit(livepatch_exit);
> +MODULE_LICENSE("GPL");
> +MODULE_INFO(livepatch, "Y");

The linked list code looks like it should work and cleans up properly,
but if the current test is the extent of what we need, it would be
simpler to directly use klp_pids[] to track progress: since we're never
going to livepatch pid 0, we could just search klp_pids[] for our pid
and then zero it out once we've executed the livepatch code.

This would let us drop out the kmalloc and list management code (looks
to be nearly 25% of the file at this point).  I think the kpid_mutex
still needs to stay though.  Iterating through the fixed array shouldn't
be too much more expensive than through the (shrinking) klp_pid_list.

On the other hand, if you think that we'll need to adapt this module to
handle a dynamically changing list of klp_pids[], then perhaps a
separate list is the way to go.

Final note, kbuild is probably my weakest area, so I'm of limited help
review-wise there.  In trying out the code, I wasn't sure exactly how to
run things.  I did:

  $ make -C tools/testing/selftests/livepatch/ run_tests

and that built everything and ran the tests (good).  I changed a few .c
files and tried again, but they didn't rebuild.  Then from
tools/testing/selftests/livepatch/test_modules/ I ran `make` and it only
ran the clean target (not good, I think).

What is the typical workflow supposed to be in this combined setup?

Thanks,

--
Joe

Petr Mladek July 15, 2022, 2:45 p.m. UTC | #2

On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
> On 7/1/22 1:48 AM, Miroslav Benes wrote:
> > On Thu, 30 Jun 2022, Shuah Khan wrote:
> > > 
> > > Sorry Nack on this. Let's not add modules under selftests. Any usage of
> > > module_init()
> > > doesn't belong under selftests.
> 
> Yes I did and after reviewing and thinking about it some more, I decided this
> is the right direction go down on.

Do you have some particular reason why building modules in selftests
directory might cause problems, please?

IMHO, the reason that the test modules are in lib is because the
modules were there before selftests. Developers historically loaded them
manually or they were built-in. Selftest were added later and are just
another way how the module can be loaded. This is the case,
for example, for lib/test_printf.c.

Otherwise, I do not see any big difference between building binaries
and modules under tools/tests/selftests. As I said, in the older
thread, IMHO, it makes more sense to have the selftest sources
self-contained.

There actually seems to be a principal problem in the following use
case:

--- cut Documentation/dev-tools/kselftest.rst ---
Kselftest from mainline can be run on older stable kernels. Running tests
from mainline offers the best coverage. Several test rings run mainline
kselftest suite on stable releases. The reason is that when a new test
gets added to test existing code to regression test a bug, we should be
able to run that test on an older kernel. Hence, it is important to keep
code that can still test an older kernel and make sure it skips the test
gracefully on newer releases.
--- cut Documentation/dev-tools/kselftest.rst ---

together with

--- cut Documentation/dev-tools/kselftest.rst ---
 * First use the headers inside the kernel source and/or git repo, and then the
   system headers.  Headers for the kernel release as opposed to headers
   installed by the distro on the system should be the primary focus to be able
   to find regressions.
--- cut Documentation/dev-tools/kselftest.rst ---

It means that selftests should support running binaries built against
newer kernel sources on system running older kernel. But this might
be pretty hard to achieve and maintain.

The normal kernel rules are exactly the opposite. Old binaries must
be able to run on newer kernels. The old binaries were built against
older headers.

IMHO, the testing of stable kernels makes perfect sense. And if we
want to support it seriously than we need to allow building new
selftests against headers from the old to-be-tested kernel. And
it will be possible only when the selftests sources are as much
selfcontained as possible.

Does this makes any sense, please?

Best Regards,
Petr

Petr Mladek July 29, 2022, 1:19 p.m. UTC | #3

On Tue 2022-07-12 10:56:11, Joe Lawrence wrote:
> On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
> > Syscalls are called a tricky way. Some architectures add a prefix to the
> > syscall name (SYSCALL_WRAPPER).
> > 
> > This new test creates one userspace process per online cpu calling getpid
> > continuously and tries to livepatch the getpid function. Add the correct
> > function prefix for all archs that select HAS_SYSCALL_WRAPPER.
> > 
> > Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> 
> Hi Marcos,
> 
> Thanks for working on this and sorry for the delayed reply.  I gave it a
> spin yesterday and have a few comments below...
> 
> > diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile
> > index 5ef492b87bb1..35014197184e 100644
> > --- a/tools/testing/selftests/livepatch/Makefile
> > +++ b/tools/testing/selftests/livepatch/Makefile
> > @@ -1,10 +1,14 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > +include ../../../build/Build.include
> > +include ../../../scripts/Makefile.arch
> > +include ../../../scripts/Makefile.include

This looks bad. It is outside of tools/tests/selftests/. It would
mean that the tests are not self-contained any longer.

The tests should support all scenarios described in
Documentation/dev-tools/kselftest.rst.

Note that it allows to install them and run them from
the installed copy. I guess that this is useful if you
want to package the selftests.

> Final note, kbuild is probably my weakest area, so I'm of limited help
> review-wise there.  In trying out the code, I wasn't sure exactly how to
> run things.  I did:
> 
>   $ make -C tools/testing/selftests/livepatch/ run_tests
> 
> and that built everything and ran the tests (good).  I changed a few .c
> files and tried again, but they didn't rebuild.  Then from
> tools/testing/selftests/livepatch/test_modules/ I ran `make` and it only
> ran the clean target (not good, I think).
>
> What is the typical workflow supposed to be in this combined setup?

I personally test the mainline kernels the following way:

workstation> make binrpm-pkg

test-system-qemu# sshfs workstation:/path/to/packages /mnt
test-system-qemu# rpm -ivh /mnt/RPMS/x86_64/kernel-xxxx.rpm
test-system-qemu# reboot

test-system-qemu# sshfs workstation:/path/to/sources /mnt
test-system-qemu# cd /mnt/linux/tools/tests/selftests/livepatch
test-system-qemu# make run_tests

With this patchset I get:

test-system-qemu:/mnt/kernel/linux/tools/testing/selftests/livepatch # make run_tests 
make -C test_modules test_klp_atomic_replace.ko
make[1]: Entering directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make -C /lib/modules/5.19.0-rc8-default+/build M=/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules test_klp_atomic_replace.ko
make[2]: Entering directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make[2]: *** /lib/modules/5.19.0-rc8-default+/build: No such file or directory.  Stop.
make[2]: Leaving directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make[1]: *** [Makefile:17: test_klp_atomic_replace.ko] Error 2
make[1]: Leaving directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make: *** [Makefile:44: /mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules/test_klp_atomic_replace.ko] Error 2

I get the same when I try it from the top level directory:

test-system-qemu:/mnt/kernel/linux # make -C tools/testing/selftests TARGETS=livepatch run_tests
make: Entering directory '/mnt/kernel/linux/tools/testing/selftests'
make --no-builtin-rules ARCH=x86 -C ../../.. headers_install
make[1]: Entering directory '/mnt/kernel/linux'
  INSTALL ./usr/include
make[1]: Leaving directory '/mnt/kernel/linux'
make[1]: Entering directory '/mnt/kernel/linux/tools/testing/selftests/livepatch'
make -C test_modules test_klp_atomic_replace.ko
make[2]: Entering directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make -C /lib/modules/5.19.0-rc8-default+/build M=/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules test_klp_atomic_replace.ko
make[3]: Entering directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make[3]: *** /lib/modules/5.19.0-rc8-default+/build: No such file or directory.  Stop.
make[3]: Leaving directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make[2]: *** [Makefile:17: test_klp_atomic_replace.ko] Error 2
make[2]: Leaving directory '/mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules'
make[1]: *** [Makefile:44: /mnt/kernel/linux/tools/testing/selftests/livepatch/test_modules/test_klp_atomic_replace.ko] Error 2
make[1]: Leaving directory '/mnt/kernel/linux/tools/testing/selftests/livepatch'
make: *** [Makefile:178: all] Error 2
make: Leaving directory '/mnt/kernel/linux/tools/testing/selftests'

I am not sure if this is exactly the process described in
Documentation/dev-tools/kselftest.rst. But it looks to me that
it is the way how to test "any" kernel with selftests
from an up-to-date kernel sources.

Best Regards,
Petr

Marcos Paulo de Souza Nov. 23, 2022, 1:35 p.m. UTC | #4

On Tue, Jul 12, 2022 at 10:56:11AM -0400, Joe Lawrence wrote:
> On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
> > Syscalls are called a tricky way. Some architectures add a prefix to the
> > syscall name (SYSCALL_WRAPPER).
> > 
> > This new test creates one userspace process per online cpu calling getpid
> > continuously and tries to livepatch the getpid function. Add the correct
> > function prefix for all archs that select HAS_SYSCALL_WRAPPER.
> > 
> > Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> 
> Hi Marcos,
> 
> Thanks for working on this and sorry for the delayed reply.  I gave it a
> spin yesterday and have a few comments below...

It has been sometime since the last version, thanks for the comments. I have
some questions bellow.

> 
> > ---
> >  tools/testing/selftests/livepatch/Makefile    |  12 +-
> >  .../selftests/livepatch/test-syscall.sh       |  52 ++++++
> >  .../test_binaries/test_klp-call_getpid.c      |  48 ++++++
> >  .../selftests/livepatch/test_modules/Makefile |   3 +-
> >  .../livepatch/test_modules/test_klp_syscall.c | 150 ++++++++++++++++++
> >  5 files changed, 261 insertions(+), 4 deletions(-)
> >  create mode 100755 tools/testing/selftests/livepatch/test-syscall.sh
> >  create mode 100644 tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> >  create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > 
> > diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile
> > index 5ef492b87bb1..35014197184e 100644
> > --- a/tools/testing/selftests/livepatch/Makefile
> > +++ b/tools/testing/selftests/livepatch/Makefile
> > @@ -1,10 +1,14 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > +include ../../../build/Build.include
> > +include ../../../scripts/Makefile.arch
> > +include ../../../scripts/Makefile.include
> >  
> >  TEST_FILES := settings \
> >  		test_modules
> >  
> >  # We need the test_modules dir in order to make gen_tar and install to work
> > -TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
> > +TEST_GEN_PROGS_EXTENDED := test_binaries/test_klp-call_getpid \
> 
> Before I commit this to muscle memory, would "test_programs" better
> describe this directory?  When I see "binaries", I think of source vs.
> build and not kernel vs. userspace.

Done locally.

> 
> > +			test_modules/test_klp_atomic_replace.ko \
> >  			test_modules/test_klp_callbacks_busy.ko \
> >  			test_modules/test_klp_callbacks_demo.ko \
> >  			test_modules/test_klp_callbacks_demo2.ko \
> > @@ -13,7 +17,8 @@ TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
> >  			test_modules/test_klp_state.ko \
> >  			test_modules/test_klp_state2.ko \
> >  			test_modules/test_klp_state3.ko \
> > -			test_modules/test_klp_shadow_vars.ko
> > +			test_modules/test_klp_shadow_vars.ko \
> > +			test_modules/test_klp_syscall.ko
> >  
> >  TEST_PROGS_EXTENDED := functions.sh
> >  TEST_PROGS := \
> > @@ -21,7 +26,8 @@ TEST_PROGS := \
> >  	test-callbacks.sh \
> >  	test-shadow-vars.sh \
> >  	test-state.sh \
> > -	test-ftrace.sh
> > +	test-ftrace.sh \
> > +	test-syscall.sh
> >  
> >  # override lib.mk's default rules
> >  OVERRIDE_TARGETS := 1
> > diff --git a/tools/testing/selftests/livepatch/test-syscall.sh b/tools/testing/selftests/livepatch/test-syscall.sh
> > new file mode 100755
> > index 000000000000..fb3270de7f1f
> > --- /dev/null
> > +++ b/tools/testing/selftests/livepatch/test-syscall.sh
> > @@ -0,0 +1,52 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (C) 2022 SUSE
> > +# Author: Marcos Paulo de Souza <mpdesouza@suse.com>
> > +
> > +. $(dirname $0)/functions.sh
> > +
> > +MOD_SYSCALL=test_klp_syscall
> > +
> > +setup_config
> > +
> > +# - Start _NRPROC processes calling getpid and load a livepatch to patch the
> > +#   getpid syscall
> > +
> > +start_test "patch getpid syscall while being heavily hammered"
> > +
> > +for i in $(seq 0 $(( $(getconf _NPROCESSORS_ONLN) -1)) ); do
> > +	./test_binaries/test_klp-call_getpid &
> > +	pids[$i]="$!"
> > +done
> 
> On second thought, bash should only iterate through the elements that
> are defined, so your earlier version is probably better:
> 
>   for i in $(seq 1 $(getconf _NPROCESSORS_ONLN)); do
> 
> > +
> > +pid_list=$(echo ${pids[@]} | tr ' ' ',')
> > +load_lp $MOD_SYSCALL klp_pids=$pid_list
> > +
> 
> We could wait for all test_klp-call_getpid instances to execute the
> livepatch code:
> 
>   loop_until 'grep -q '^0$' /sys/kernel/test_klp_syscall/npids'
> 
> This has a built in timeout, so it's not infinitely looping.  You
> could add some die "reason" logic here, or just let the existing code
> cleanup after the failed test.

Makes sense. Done locally.

> 
> > +pending_pids=$(cat /sys/kernel/test_klp_syscall/npids)
> > +
> > +for pid in ${pids[@]}; do
> > +	kill $pid || true
> > +done
> > +
> > +disable_lp $MOD_SYSCALL
> > +unload_lp $MOD_SYSCALL
> > +
> > +if [[ "$pending_pids" != "0" ]]; then
> > +	echo -e "FAIL\n\n"
> > +	die "processes not livepatched: $pending_pids"
> > +fi
> > +
> > +check_result "% insmod test_modules/$MOD_SYSCALL.ko klp_pids=$pid_list
> > +livepatch: enabling patch '$MOD_SYSCALL'
> > +livepatch: '$MOD_SYSCALL': initializing patching transition
> > +livepatch: '$MOD_SYSCALL': starting patching transition
> > +livepatch: '$MOD_SYSCALL': completing patching transition
> > +livepatch: '$MOD_SYSCALL': patching complete
> > +% echo 0 > /sys/kernel/livepatch/$MOD_SYSCALL/enabled
> > +livepatch: '$MOD_SYSCALL': initializing unpatching transition
> > +livepatch: '$MOD_SYSCALL': starting unpatching transition
> > +livepatch: '$MOD_SYSCALL': completing unpatching transition
> > +livepatch: '$MOD_SYSCALL': unpatching complete
> > +% rmmod $MOD_SYSCALL"
> > +
> > +exit 0
> > diff --git a/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> > new file mode 100644
> > index 000000000000..ec470660dee6
> > --- /dev/null
> > +++ b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> > @@ -0,0 +1,48 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2017 SUSE
> > + * Author: Libor Pechacek <lpechacek@suse.cz>
> > + */
> > +
> > +#include <stdio.h>
> > +#include <unistd.h>
> > +#include <sys/syscall.h>
> > +#include <sys/types.h>
> > +#include <signal.h>
> > +
> > +static int stop = 0;
> 
> nit: no need to init global to 0

Indeed. Fixed.

> 
> > +static int sig_int;
> > +
> > +void hup_handler(int signum)
> > +{
> > +	stop = 1;
> > +}
> > +
> > +void int_handler(int signum)
> > +{
> > +	stop = 1;
> > +	sig_int = 1;
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +	pid_t orig_pid, pid;
> > +	long count = 0;
> > +
> > +	signal(SIGHUP, &hup_handler);
> > +	signal(SIGINT, &int_handler);
> > +
> > +	orig_pid = syscall(SYS_getpid);
> > +
> > +	while (!stop) {
> > +		pid = syscall(SYS_getpid);
> > +		if (pid != orig_pid)
> > +			return 1;
> 
> This test doesn't care about the user program return code, but I wonder
> if the status should be flipped -- this is the desired code path, not
> the one at the end of main(), right?

I got the code from our tests. IIUC, syscall with SYS_getpid cannot fail. I'll
double check before sending v3.

> 
> > +		count++;
> > +	}
> > +
> > +	if (sig_int)
> > +		printf("%ld iterations done\n", count);
> > +
> > +	return 0;
> > +}
> > diff --git a/tools/testing/selftests/livepatch/test_modules/Makefile b/tools/testing/selftests/livepatch/test_modules/Makefile
> > index 1eab43b741cd..ebb754accf46 100644
> > --- a/tools/testing/selftests/livepatch/test_modules/Makefile
> > +++ b/tools/testing/selftests/livepatch/test_modules/Makefile
> > @@ -10,7 +10,8 @@ obj-m += test_klp_atomic_replace.o \
> >  	test_klp_state.o \
> >  	test_klp_state2.o \
> >  	test_klp_state3.o \
> > -	test_klp_shadow_vars.o
> > +	test_klp_shadow_vars.o \
> > +	test_klp_syscall.o
> >  
> >  %.ko:
> >  	make -C $(KDIR) M=$(TESTMODS_DIR) $@
> > diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > new file mode 100644
> > index 000000000000..e90f4ac8e7a4
> > --- /dev/null
> > +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > @@ -0,0 +1,150 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2017-2022 SUSE
> > + * Authors: Libor Pechacek <lpechacek@suse.cz>
> > + *          Nicolai Stange <nstange@suse.de>
> > + *          Marcos Paulo de Souza <mpdesouza@suse.com>
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include <linux/sched.h>
> > +#include <linux/slab.h>
> > +#include <linux/livepatch.h>
> > +
> > +#if defined(__x86_64__)
> > +#define FN_PREFIX __x64_
> > +#elif defined(__s390x__)
> > +#define FN_PREFIX __s390x_
> > +#elif defined(__aarch64__)
> > +#define FN_PREFIX __arm64_
> > +#else
> > +/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> > +#define FN_PREFIX
> > +#endif
> > +
> > +struct klp_pid_t {
> > +	pid_t pid;
> > +	struct list_head list;
> > +};
> > +static LIST_HEAD(klp_pid_list);
> > +
> > +/* Protects klp_pid_list */
> > +static DEFINE_MUTEX(kpid_mutex);
> > +
> > +static int klp_pids[NR_CPUS];
> > +static unsigned int npids;
> > +module_param_array(klp_pids, int, &npids, 0);
> > +
> > +static ssize_t npids_show(struct kobject *kobj, struct kobj_attribute *attr,
> > +			  char *buf)
> > +{
> > +	return sprintf(buf, "%u\n", npids);
> > +}
> > +
> > +static struct kobj_attribute klp_attr = __ATTR_RO(npids);
> > +static struct kobject *klp_kobj;
> > +
> > +static void free_klp_pid_list(void)
> > +{
> > +	struct klp_pid_t *kpid, *temp;
> > +
> > +	mutex_lock(&kpid_mutex);
> > +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> > +		list_del(&kpid->list);
> > +		kfree(kpid);
> > +	}
> > +	mutex_unlock(&kpid_mutex);
> > +}
> > +
> > +asmlinkage long lp_sys_getpid(void)
> > +{
> > +	struct klp_pid_t *kpid, *temp;
> > +
> > +	/*
> > +	 * For each thread calling getpid, check if the pid exists in
> > +	 * klp_pid_list. If yes, decrement the npids variable and remove the pid
> > +	 * from the list. npids will be later used to ensure that all pids
> > +	 * transitioned to the liveaptched state.
> 
> I go back and forth with this comment: 1) the code is pretty
> self-explanatory and 2) it explains what test-syscall.sh is doing with
> npids
> 
> Describing the behavioral difference I suggest below would be helpful
> (if you agree w/the change).  Something about removing the current pid
> from the list, okay, too.

Makes sense. I'll move the comment to the test.

> 
> Then as far as describing npids' purpose, I'd put that up at the top of
> the file where npids is defined.  It can be as generic as describing
> what it's counting.

Thanks for the suggestion. A short description will be added when defining the
module argument.

> 
> > +	 */
> > +	mutex_lock(&kpid_mutex);
> > +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> > +		if (current->pid == kpid->pid) {
> > +			list_del(&kpid->list);
> > +			kfree(kpid);
> > +			npids--;
> > +			break;
> 
> I think it would be safer to return task_tgid_vnr() here, but ...
> 
> > +		}
> > +	}
> > +	mutex_unlock(&kpid_mutex);
> > +
> > +	return task_tgid_vnr(current);
> 
> task_pid_vnr() here.  That way we're only changing behavior for the
> processes in the list and not all programs across the system.

So, looking at getpid syscall definition, it calls task_tgid_vnr:

SYSCALL_DEFINE0(getpid)
{
        return task_tgid_vnr(current);
}

The function task_pid_vnr is called by gettid. So, IIUC, the test should call
task_tgid_vnr, since getpid is being livepatched.

Am I missing something?

> 
> > +}
> > +
> > +static struct klp_func vmlinux_funcs[] = {
> > +	{
> > +		.old_name = __stringify(FN_PREFIX) "sys_getpid",
> > +		.new_func = lp_sys_getpid,
> > +	}, {}
> > +};
> > +
> > +static struct klp_object objs[] = {
> > +	{
> > +		/* name being NULL means vmlinux */
> > +		.funcs = vmlinux_funcs,
> > +	}, {}
> > +};
> > +
> > +static struct klp_patch patch = {
> > +	.mod = THIS_MODULE,
> > +	.objs = objs,
> > +};
> > +
> > +static int livepatch_init(void)
> > +{
> > +	int ret;
> > +	struct klp_pid_t *kpid;
> > +
> > +	if (npids > 0) {
> > +		int i;
> > +
> > +		for (i = 0; i < npids; i++) {
> > +			kpid = kmalloc(sizeof(struct klp_pid_t), GFP_KERNEL);
> > +			if (!kpid)
> > +				goto err_mem;
> > +
> > +			kpid->pid = klp_pids[i];
> > +			list_add(&kpid->list, &klp_pid_list);
> > +		}
> > +	}
> > +
> > +	klp_kobj = kobject_create_and_add("test_klp_syscall", kernel_kobj);
> > +	if (!klp_kobj)
> > +		goto err_mem;
> > +
> > +	ret = sysfs_create_file(klp_kobj, &klp_attr.attr);
> > +	if (ret) {
> > +		kobject_put(klp_kobj);
> > +		goto out_klp_pid_list;
> > +	}
> > +
> > +	return klp_enable_patch(&patch);
> > +
> > +err_mem:
> > +	ret = -ENOMEM;
> > +out_klp_pid_list:
> > +	free_klp_pid_list();
> > +
> > +	return ret;
> > +}
> > +
> > +static void livepatch_exit(void)
> > +{
> > +	free_klp_pid_list();
> > +	kobject_put(klp_kobj);
> > +}
> > +
> > +module_init(livepatch_init);
> > +module_exit(livepatch_exit);
> > +MODULE_LICENSE("GPL");
> > +MODULE_INFO(livepatch, "Y");
> 
> The linked list code looks like it should work and cleans up properly,
> but if the current test is the extent of what we need, it would be
> simpler to directly use klp_pids[] to track progress: since we're never
> going to livepatch pid 0, we could just search klp_pids[] for our pid
> and then zero it out once we've executed the livepatch code.
> 
> This would let us drop out the kmalloc and list management code (looks
> to be nearly 25% of the file at this point).  I think the kpid_mutex
> still needs to stay though.  Iterating through the fixed array shouldn't
> be too much more expensive than through the (shrinking) klp_pid_list.
> 
> On the other hand, if you think that we'll need to adapt this module to
> handle a dynamically changing list of klp_pids[], then perhaps a
> separate list is the way to go.

My first idea was to use klp_pids indexed by NR_CPUS, but I thought about bigger
machines with more CPUs, making the array too big for just a couple of pids to
be tracked...

Thinking again, I don't think it's a problem. Also, I don't believe that we
would need to dynamically add pids, so this code can go. In the next version
I'll just access klp_pids directly. Thanks for the suggestion!

> 
> Final note, kbuild is probably my weakest area, so I'm of limited help
> review-wise there.  In trying out the code, I wasn't sure exactly how to
> run things.  I did:
> 
>   $ make -C tools/testing/selftests/livepatch/ run_tests
> 
> and that built everything and ran the tests (good).  I changed a few .c
> files and tried again, but they didn't rebuild.  Then from
> tools/testing/selftests/livepatch/test_modules/ I ran `make` and it only
> ran the clean target (not good, I think).

I need to test it and make sure the code is rebuilt. Thanks for noticing this
problem.

> 
> What is the typical workflow supposed to be in this combined setup?

The plan was to support all forms of running the selftests: by running it in
place and by running the "installed" kselftests. I'll test it again to make sure
that the code is rebuilt when it should.

> 
> Thanks,
> 
> --
> Joe
>

Marcos Paulo de Souza Nov. 24, 2022, 3:39 a.m. UTC | #5

On Wed, Nov 23, 2022 at 10:35:27AM -0300, Marcos Paulo de Souza wrote:
> On Tue, Jul 12, 2022 at 10:56:11AM -0400, Joe Lawrence wrote:
> > On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
> > > Syscalls are called a tricky way. Some architectures add a prefix to the
> > > syscall name (SYSCALL_WRAPPER).
> > > 
> > > This new test creates one userspace process per online cpu calling getpid
> > > continuously and tries to livepatch the getpid function. Add the correct
> > > function prefix for all archs that select HAS_SYSCALL_WRAPPER.
> > > 
> > > Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> > 
> > Hi Marcos,
> > 
> > Thanks for working on this and sorry for the delayed reply.  I gave it a
> > spin yesterday and have a few comments below...
> 
> It has been sometime since the last version, thanks for the comments. I have
> some questions bellow.
> 
> > 
> > > ---
> > >  tools/testing/selftests/livepatch/Makefile    |  12 +-
> > >  .../selftests/livepatch/test-syscall.sh       |  52 ++++++
> > >  .../test_binaries/test_klp-call_getpid.c      |  48 ++++++
> > >  .../selftests/livepatch/test_modules/Makefile |   3 +-
> > >  .../livepatch/test_modules/test_klp_syscall.c | 150 ++++++++++++++++++
> > >  5 files changed, 261 insertions(+), 4 deletions(-)
> > >  create mode 100755 tools/testing/selftests/livepatch/test-syscall.sh
> > >  create mode 100644 tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> > >  create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > > 
> > > diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile
> > > index 5ef492b87bb1..35014197184e 100644
> > > --- a/tools/testing/selftests/livepatch/Makefile
> > > +++ b/tools/testing/selftests/livepatch/Makefile
> > > @@ -1,10 +1,14 @@
> > >  # SPDX-License-Identifier: GPL-2.0
> > > +include ../../../build/Build.include
> > > +include ../../../scripts/Makefile.arch
> > > +include ../../../scripts/Makefile.include
> > >  
> > >  TEST_FILES := settings \
> > >  		test_modules
> > >  
> > >  # We need the test_modules dir in order to make gen_tar and install to work
> > > -TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
> > > +TEST_GEN_PROGS_EXTENDED := test_binaries/test_klp-call_getpid \
> > 
> > Before I commit this to muscle memory, would "test_programs" better
> > describe this directory?  When I see "binaries", I think of source vs.
> > build and not kernel vs. userspace.
> 
> Done locally.
> 
> > 
> > > +			test_modules/test_klp_atomic_replace.ko \
> > >  			test_modules/test_klp_callbacks_busy.ko \
> > >  			test_modules/test_klp_callbacks_demo.ko \
> > >  			test_modules/test_klp_callbacks_demo2.ko \
> > > @@ -13,7 +17,8 @@ TEST_GEN_PROGS_EXTENDED := test_modules/test_klp_atomic_replace.ko \
> > >  			test_modules/test_klp_state.ko \
> > >  			test_modules/test_klp_state2.ko \
> > >  			test_modules/test_klp_state3.ko \
> > > -			test_modules/test_klp_shadow_vars.ko
> > > +			test_modules/test_klp_shadow_vars.ko \
> > > +			test_modules/test_klp_syscall.ko
> > >  
> > >  TEST_PROGS_EXTENDED := functions.sh
> > >  TEST_PROGS := \
> > > @@ -21,7 +26,8 @@ TEST_PROGS := \
> > >  	test-callbacks.sh \
> > >  	test-shadow-vars.sh \
> > >  	test-state.sh \
> > > -	test-ftrace.sh
> > > +	test-ftrace.sh \
> > > +	test-syscall.sh
> > >  
> > >  # override lib.mk's default rules
> > >  OVERRIDE_TARGETS := 1
> > > diff --git a/tools/testing/selftests/livepatch/test-syscall.sh b/tools/testing/selftests/livepatch/test-syscall.sh
> > > new file mode 100755
> > > index 000000000000..fb3270de7f1f
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/livepatch/test-syscall.sh
> > > @@ -0,0 +1,52 @@
> > > +#!/bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (C) 2022 SUSE
> > > +# Author: Marcos Paulo de Souza <mpdesouza@suse.com>
> > > +
> > > +. $(dirname $0)/functions.sh
> > > +
> > > +MOD_SYSCALL=test_klp_syscall
> > > +
> > > +setup_config
> > > +
> > > +# - Start _NRPROC processes calling getpid and load a livepatch to patch the
> > > +#   getpid syscall
> > > +
> > > +start_test "patch getpid syscall while being heavily hammered"
> > > +
> > > +for i in $(seq 0 $(( $(getconf _NPROCESSORS_ONLN) -1)) ); do
> > > +	./test_binaries/test_klp-call_getpid &
> > > +	pids[$i]="$!"
> > > +done
> > 
> > On second thought, bash should only iterate through the elements that
> > are defined, so your earlier version is probably better:
> > 
> >   for i in $(seq 1 $(getconf _NPROCESSORS_ONLN)); do
> > 
> > > +
> > > +pid_list=$(echo ${pids[@]} | tr ' ' ',')
> > > +load_lp $MOD_SYSCALL klp_pids=$pid_list
> > > +
> > 
> > We could wait for all test_klp-call_getpid instances to execute the
> > livepatch code:
> > 
> >   loop_until 'grep -q '^0$' /sys/kernel/test_klp_syscall/npids'
> > 
> > This has a built in timeout, so it's not infinitely looping.  You
> > could add some die "reason" logic here, or just let the existing code
> > cleanup after the failed test.
> 
> Makes sense. Done locally.
> 
> > 
> > > +pending_pids=$(cat /sys/kernel/test_klp_syscall/npids)
> > > +
> > > +for pid in ${pids[@]}; do
> > > +	kill $pid || true
> > > +done
> > > +
> > > +disable_lp $MOD_SYSCALL
> > > +unload_lp $MOD_SYSCALL
> > > +
> > > +if [[ "$pending_pids" != "0" ]]; then
> > > +	echo -e "FAIL\n\n"
> > > +	die "processes not livepatched: $pending_pids"
> > > +fi
> > > +
> > > +check_result "% insmod test_modules/$MOD_SYSCALL.ko klp_pids=$pid_list
> > > +livepatch: enabling patch '$MOD_SYSCALL'
> > > +livepatch: '$MOD_SYSCALL': initializing patching transition
> > > +livepatch: '$MOD_SYSCALL': starting patching transition
> > > +livepatch: '$MOD_SYSCALL': completing patching transition
> > > +livepatch: '$MOD_SYSCALL': patching complete
> > > +% echo 0 > /sys/kernel/livepatch/$MOD_SYSCALL/enabled
> > > +livepatch: '$MOD_SYSCALL': initializing unpatching transition
> > > +livepatch: '$MOD_SYSCALL': starting unpatching transition
> > > +livepatch: '$MOD_SYSCALL': completing unpatching transition
> > > +livepatch: '$MOD_SYSCALL': unpatching complete
> > > +% rmmod $MOD_SYSCALL"
> > > +
> > > +exit 0
> > > diff --git a/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> > > new file mode 100644
> > > index 000000000000..ec470660dee6
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/livepatch/test_binaries/test_klp-call_getpid.c
> > > @@ -0,0 +1,48 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2017 SUSE
> > > + * Author: Libor Pechacek <lpechacek@suse.cz>
> > > + */
> > > +
> > > +#include <stdio.h>
> > > +#include <unistd.h>
> > > +#include <sys/syscall.h>
> > > +#include <sys/types.h>
> > > +#include <signal.h>
> > > +
> > > +static int stop = 0;
> > 
> > nit: no need to init global to 0
> 
> Indeed. Fixed.
> 
> > 
> > > +static int sig_int;
> > > +
> > > +void hup_handler(int signum)
> > > +{
> > > +	stop = 1;
> > > +}
> > > +
> > > +void int_handler(int signum)
> > > +{
> > > +	stop = 1;
> > > +	sig_int = 1;
> > > +}
> > > +
> > > +int main(int argc, char *argv[])
> > > +{
> > > +	pid_t orig_pid, pid;
> > > +	long count = 0;
> > > +
> > > +	signal(SIGHUP, &hup_handler);
> > > +	signal(SIGINT, &int_handler);
> > > +
> > > +	orig_pid = syscall(SYS_getpid);
> > > +
> > > +	while (!stop) {
> > > +		pid = syscall(SYS_getpid);
> > > +		if (pid != orig_pid)
> > > +			return 1;
> > 
> > This test doesn't care about the user program return code, but I wonder
> > if the status should be flipped -- this is the desired code path, not
> > the one at the end of main(), right?
> 
> I got the code from our tests. IIUC, syscall with SYS_getpid cannot fail. I'll
> double check before sending v3.
> 
> > 
> > > +		count++;
> > > +	}
> > > +
> > > +	if (sig_int)
> > > +		printf("%ld iterations done\n", count);
> > > +
> > > +	return 0;
> > > +}
> > > diff --git a/tools/testing/selftests/livepatch/test_modules/Makefile b/tools/testing/selftests/livepatch/test_modules/Makefile
> > > index 1eab43b741cd..ebb754accf46 100644
> > > --- a/tools/testing/selftests/livepatch/test_modules/Makefile
> > > +++ b/tools/testing/selftests/livepatch/test_modules/Makefile
> > > @@ -10,7 +10,8 @@ obj-m += test_klp_atomic_replace.o \
> > >  	test_klp_state.o \
> > >  	test_klp_state2.o \
> > >  	test_klp_state3.o \
> > > -	test_klp_shadow_vars.o
> > > +	test_klp_shadow_vars.o \
> > > +	test_klp_syscall.o
> > >  
> > >  %.ko:
> > >  	make -C $(KDIR) M=$(TESTMODS_DIR) $@
> > > diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > > new file mode 100644
> > > index 000000000000..e90f4ac8e7a4
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > > @@ -0,0 +1,150 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2017-2022 SUSE
> > > + * Authors: Libor Pechacek <lpechacek@suse.cz>
> > > + *          Nicolai Stange <nstange@suse.de>
> > > + *          Marcos Paulo de Souza <mpdesouza@suse.com>
> > > + */
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/kernel.h>
> > > +#include <linux/sched.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/livepatch.h>
> > > +
> > > +#if defined(__x86_64__)
> > > +#define FN_PREFIX __x64_
> > > +#elif defined(__s390x__)
> > > +#define FN_PREFIX __s390x_
> > > +#elif defined(__aarch64__)
> > > +#define FN_PREFIX __arm64_
> > > +#else
> > > +/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> > > +#define FN_PREFIX
> > > +#endif
> > > +
> > > +struct klp_pid_t {
> > > +	pid_t pid;
> > > +	struct list_head list;
> > > +};
> > > +static LIST_HEAD(klp_pid_list);
> > > +
> > > +/* Protects klp_pid_list */
> > > +static DEFINE_MUTEX(kpid_mutex);
> > > +
> > > +static int klp_pids[NR_CPUS];
> > > +static unsigned int npids;
> > > +module_param_array(klp_pids, int, &npids, 0);
> > > +
> > > +static ssize_t npids_show(struct kobject *kobj, struct kobj_attribute *attr,
> > > +			  char *buf)
> > > +{
> > > +	return sprintf(buf, "%u\n", npids);
> > > +}
> > > +
> > > +static struct kobj_attribute klp_attr = __ATTR_RO(npids);
> > > +static struct kobject *klp_kobj;
> > > +
> > > +static void free_klp_pid_list(void)
> > > +{
> > > +	struct klp_pid_t *kpid, *temp;
> > > +
> > > +	mutex_lock(&kpid_mutex);
> > > +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> > > +		list_del(&kpid->list);
> > > +		kfree(kpid);
> > > +	}
> > > +	mutex_unlock(&kpid_mutex);
> > > +}
> > > +
> > > +asmlinkage long lp_sys_getpid(void)
> > > +{
> > > +	struct klp_pid_t *kpid, *temp;
> > > +
> > > +	/*
> > > +	 * For each thread calling getpid, check if the pid exists in
> > > +	 * klp_pid_list. If yes, decrement the npids variable and remove the pid
> > > +	 * from the list. npids will be later used to ensure that all pids
> > > +	 * transitioned to the liveaptched state.
> > 
> > I go back and forth with this comment: 1) the code is pretty
> > self-explanatory and 2) it explains what test-syscall.sh is doing with
> > npids
> > 
> > Describing the behavioral difference I suggest below would be helpful
> > (if you agree w/the change).  Something about removing the current pid
> > from the list, okay, too.
> 
> Makes sense. I'll move the comment to the test.
> 
> > 
> > Then as far as describing npids' purpose, I'd put that up at the top of
> > the file where npids is defined.  It can be as generic as describing
> > what it's counting.
> 
> Thanks for the suggestion. A short description will be added when defining the
> module argument.
> 
> > 
> > > +	 */
> > > +	mutex_lock(&kpid_mutex);
> > > +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> > > +		if (current->pid == kpid->pid) {
> > > +			list_del(&kpid->list);
> > > +			kfree(kpid);
> > > +			npids--;
> > > +			break;
> > 
> > I think it would be safer to return task_tgid_vnr() here, but ...
> > 
> > > +		}
> > > +	}
> > > +	mutex_unlock(&kpid_mutex);
> > > +
> > > +	return task_tgid_vnr(current);
> > 
> > task_pid_vnr() here.  That way we're only changing behavior for the
> > processes in the list and not all programs across the system.
> 
> So, looking at getpid syscall definition, it calls task_tgid_vnr:
> 
> SYSCALL_DEFINE0(getpid)
> {
>         return task_tgid_vnr(current);
> }
> 
> The function task_pid_vnr is called by gettid. So, IIUC, the test should call
> task_tgid_vnr, since getpid is being livepatched.
> 
> Am I missing something?
> 
> > 
> > > +}
> > > +
> > > +static struct klp_func vmlinux_funcs[] = {
> > > +	{
> > > +		.old_name = __stringify(FN_PREFIX) "sys_getpid",
> > > +		.new_func = lp_sys_getpid,
> > > +	}, {}
> > > +};
> > > +
> > > +static struct klp_object objs[] = {
> > > +	{
> > > +		/* name being NULL means vmlinux */
> > > +		.funcs = vmlinux_funcs,
> > > +	}, {}
> > > +};
> > > +
> > > +static struct klp_patch patch = {
> > > +	.mod = THIS_MODULE,
> > > +	.objs = objs,
> > > +};
> > > +
> > > +static int livepatch_init(void)
> > > +{
> > > +	int ret;
> > > +	struct klp_pid_t *kpid;
> > > +
> > > +	if (npids > 0) {
> > > +		int i;
> > > +
> > > +		for (i = 0; i < npids; i++) {
> > > +			kpid = kmalloc(sizeof(struct klp_pid_t), GFP_KERNEL);
> > > +			if (!kpid)
> > > +				goto err_mem;
> > > +
> > > +			kpid->pid = klp_pids[i];
> > > +			list_add(&kpid->list, &klp_pid_list);
> > > +		}
> > > +	}
> > > +
> > > +	klp_kobj = kobject_create_and_add("test_klp_syscall", kernel_kobj);
> > > +	if (!klp_kobj)
> > > +		goto err_mem;
> > > +
> > > +	ret = sysfs_create_file(klp_kobj, &klp_attr.attr);
> > > +	if (ret) {
> > > +		kobject_put(klp_kobj);
> > > +		goto out_klp_pid_list;
> > > +	}
> > > +
> > > +	return klp_enable_patch(&patch);
> > > +
> > > +err_mem:
> > > +	ret = -ENOMEM;
> > > +out_klp_pid_list:
> > > +	free_klp_pid_list();
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void livepatch_exit(void)
> > > +{
> > > +	free_klp_pid_list();
> > > +	kobject_put(klp_kobj);
> > > +}
> > > +
> > > +module_init(livepatch_init);
> > > +module_exit(livepatch_exit);
> > > +MODULE_LICENSE("GPL");
> > > +MODULE_INFO(livepatch, "Y");
> > 
> > The linked list code looks like it should work and cleans up properly,
> > but if the current test is the extent of what we need, it would be
> > simpler to directly use klp_pids[] to track progress: since we're never
> > going to livepatch pid 0, we could just search klp_pids[] for our pid
> > and then zero it out once we've executed the livepatch code.
> > 
> > This would let us drop out the kmalloc and list management code (looks
> > to be nearly 25% of the file at this point).  I think the kpid_mutex
> > still needs to stay though.  Iterating through the fixed array shouldn't
> > be too much more expensive than through the (shrinking) klp_pid_list.
> > 
> > On the other hand, if you think that we'll need to adapt this module to
> > handle a dynamically changing list of klp_pids[], then perhaps a
> > separate list is the way to go.
> 
> My first idea was to use klp_pids indexed by NR_CPUS, but I thought about bigger
> machines with more CPUs, making the array too big for just a couple of pids to
> be tracked...

So, I already have klp_pids being defines using NR_CPUS, so forget what I said
earlier. I'm changing the code locally to address klp_pids without using a
linked list.

> 
> Thinking again, I don't think it's a problem. Also, I don't believe that we
> would need to dynamically add pids, so this code can go. In the next version
> I'll just access klp_pids directly. Thanks for the suggestion!
> 
> > 
> > Final note, kbuild is probably my weakest area, so I'm of limited help
> > review-wise there.  In trying out the code, I wasn't sure exactly how to
> > run things.  I did:
> > 
> >   $ make -C tools/testing/selftests/livepatch/ run_tests
> > 
> > and that built everything and ran the tests (good).  I changed a few .c
> > files and tried again, but they didn't rebuild.  Then from
> > tools/testing/selftests/livepatch/test_modules/ I ran `make` and it only
> > ran the clean target (not good, I think).
> 
> I need to test it and make sure the code is rebuilt. Thanks for noticing this
> problem.
> 
> > 
> > What is the typical workflow supposed to be in this combined setup?
> 
> The plan was to support all forms of running the selftests: by running it in
> place and by running the "installed" kselftests. I'll test it again to make sure
> that the code is rebuilt when it should.
> 
> > 
> > Thanks,
> > 
> > --
> > Joe
> >

Marcos Paulo de Souza Nov. 24, 2022, 1:05 p.m. UTC | #6

On Tue, Jul 12, 2022 at 10:56:11AM -0400, Joe Lawrence wrote:
> On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
...
> nit: no need to init global to 0
> 
> > +static int sig_int;
> > +
> > +void hup_handler(int signum)
> > +{
> > +	stop = 1;
> > +}
> > +
> > +void int_handler(int signum)
> > +{
> > +	stop = 1;
> > +	sig_int = 1;
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +	pid_t orig_pid, pid;
> > +	long count = 0;
> > +
> > +	signal(SIGHUP, &hup_handler);
> > +	signal(SIGINT, &int_handler);
> > +
> > +	orig_pid = syscall(SYS_getpid);
> > +
> > +	while (!stop) {
> > +		pid = syscall(SYS_getpid);
> > +		if (pid != orig_pid)
> > +			return 1;
> 
> This test doesn't care about the user program return code, but I wonder
> if the status should be flipped -- this is the desired code path, not
> the one at the end of main(), right?
> 
...
> > +	 */
> > +	mutex_lock(&kpid_mutex);
> > +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
> > +		if (current->pid == kpid->pid) {
> > +			list_del(&kpid->list);
> > +			kfree(kpid);
> > +			npids--;
> > +			break;
> 
> I think it would be safer to return task_tgid_vnr() here, but ...
> 
> > +		}
> > +	}
> > +	mutex_unlock(&kpid_mutex);
> > +
> > +	return task_tgid_vnr(current);
> 
> task_pid_vnr() here.  That way we're only changing behavior for the
> processes in the list and not all programs across the system.

I believe that these two suggestions can be linked per your answer. First of
all, I didn't write the original test program, but I agree that we can make it
better.

My intent by upstreaming the test was to ensure that test programs
would keep working even when livepatching getpid while having processes calling
getpid nonstop. For the purpose of the test, the test module livepatches getpid,
but keeping the same behavior as before. The only change is to keep track of the
test programs that need to transition to livepatched state.

Per your comment on the test program it seems that we expected to receive a
different value from getpid, but it's not the case here. I believe the chec on
test program is confusing and doesn't bring any benefit, so maybe it's better to
remove it and keep the test even simpler:

  --- a/tools/testing/selftests/livepatch/test_programs/test_klp-call_getpid.c
  +++ b/tools/testing/selftests/livepatch/test_programs/test_klp-call_getpid.c
  @@ -26,18 +26,13 @@ void int_handler(int signum)

   int main(int argc, char *argv[])
   {
  -   pid_t orig_pid, pid;
      long count = 0;

      signal(SIGHUP, &hup_handler);
      signal(SIGINT, &int_handler);

  -   orig_pid = syscall(SYS_getpid);
  -
      while (!stop) {
  -       pid = syscall(SYS_getpid);
  -       if (pid != orig_pid)
  -           return 1;
  +       (void)syscall(SYS_getpid);
          count++;
      }

As as only care about the processes transitioning to the livepatched state, we
really don't care about the getpid return value (as it should be the same from patches
and unpatched state).

I believe this resolves both issues. What do you think?

Joe Lawrence Nov. 30, 2022, 10:19 p.m. UTC | #7

On 11/24/22 8:05 AM, Marcos Paulo de Souza wrote:
> On Tue, Jul 12, 2022 at 10:56:11AM -0400, Joe Lawrence wrote:
>> On Thu, Jun 30, 2022 at 11:12:26AM -0300, Marcos Paulo de Souza wrote:
> ...
>> nit: no need to init global to 0
>>
>>> +static int sig_int;
>>> +
>>> +void hup_handler(int signum)
>>> +{
>>> +	stop = 1;
>>> +}
>>> +
>>> +void int_handler(int signum)
>>> +{
>>> +	stop = 1;
>>> +	sig_int = 1;
>>> +}
>>> +
>>> +int main(int argc, char *argv[])
>>> +{
>>> +	pid_t orig_pid, pid;
>>> +	long count = 0;
>>> +
>>> +	signal(SIGHUP, &hup_handler);
>>> +	signal(SIGINT, &int_handler);
>>> +
>>> +	orig_pid = syscall(SYS_getpid);
>>> +
>>> +	while (!stop) {
>>> +		pid = syscall(SYS_getpid);
>>> +		if (pid != orig_pid)
>>> +			return 1;
>>
>> This test doesn't care about the user program return code, but I wonder
>> if the status should be flipped -- this is the desired code path, not
>> the one at the end of main(), right?
>>
> ...

Ah, nevermind ...

>>> +	 */
>>> +	mutex_lock(&kpid_mutex);
>>> +	list_for_each_entry_safe(kpid, temp, &klp_pid_list, list) {
>>> +		if (current->pid == kpid->pid) {
>>> +			list_del(&kpid->list);
>>> +			kfree(kpid);
>>> +			npids--;
>>> +			break;
>>
>> I think it would be safer to return task_tgid_vnr() here, but ...
>>
>>> +		}
>>> +	}
>>> +	mutex_unlock(&kpid_mutex);
>>> +
>>> +	return task_tgid_vnr(current);
>>
>> task_pid_vnr() here.  That way we're only changing behavior for the
>> processes in the list and not all programs across the system.
> 
> I believe that these two suggestions can be linked per your answer. First of
> all, I didn't write the original test program, but I agree that we can make it
> better.
> 
> My intent by upstreaming the test was to ensure that test programs
> would keep working even when livepatching getpid while having processes calling
> getpid nonstop. For the purpose of the test, the test module livepatches getpid,
> but keeping the same behavior as before. The only change is to keep track of the
> test programs that need to transition to livepatched state.
> 
> Per your comment on the test program it seems that we expected to receive a
> different value from getpid, but it's not the case here. I believe the chec on
> test program is confusing and doesn't bring any benefit, so maybe it's better to
> remove it and keep the test even simpler:
> 

Now that I re-read it months later, I see that I mistakenly thought that
the livepatch was supposed to modify the return value.  So disregard my
earlier suggestions / questions about that.

>   --- a/tools/testing/selftests/livepatch/test_programs/test_klp-call_getpid.c
>   +++ b/tools/testing/selftests/livepatch/test_programs/test_klp-call_getpid.c
>   @@ -26,18 +26,13 @@ void int_handler(int signum)
> 
>    int main(int argc, char *argv[])
>    {
>   -   pid_t orig_pid, pid;
>       long count = 0;
> 
>       signal(SIGHUP, &hup_handler);
>       signal(SIGINT, &int_handler);
> 
>   -   orig_pid = syscall(SYS_getpid);
>   -
>       while (!stop) {
>   -       pid = syscall(SYS_getpid);
>   -       if (pid != orig_pid)
>   -           return 1;
>   +       (void)syscall(SYS_getpid);
>           count++;
>       }
> 
> As as only care about the processes transitioning to the livepatched state, we
> really don't care about the getpid return value (as it should be the same from patches
> and unpatched state).
> 
> I believe this resolves both issues. What do you think?
> 

Verifying orig_pid against the current pid return value would verify
that it was cleanly patched... though that seems incidental to the
purpose of the test.  The simplification above is ok with me either way.

Joe Lawrence Nov. 30, 2022, 10:22 p.m. UTC | #8

On 7/15/22 10:45 AM, Petr Mladek wrote:
> On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
>> On 7/1/22 1:48 AM, Miroslav Benes wrote:
>>> On Thu, 30 Jun 2022, Shuah Khan wrote:
>>>>
>>>> Sorry Nack on this. Let's not add modules under selftests. Any usage of
>>>> module_init()
>>>> doesn't belong under selftests.
>>
>> Yes I did and after reviewing and thinking about it some more, I decided this
>> is the right direction go down on.
> 
> Do you have some particular reason why building modules in selftests
> directory might cause problems, please?
> 
> IMHO, the reason that the test modules are in lib is because the
> modules were there before selftests. Developers historically loaded them
> manually or they were built-in. Selftest were added later and are just
> another way how the module can be loaded. This is the case,
> for example, for lib/test_printf.c.
> 
> Otherwise, I do not see any big difference between building binaries
> and modules under tools/tests/selftests. As I said, in the older
> thread, IMHO, it makes more sense to have the selftest sources
> self-contained.
> 
> 
> There actually seems to be a principal problem in the following use
> case:
> 
> --- cut Documentation/dev-tools/kselftest.rst ---
> Kselftest from mainline can be run on older stable kernels. Running tests
> from mainline offers the best coverage. Several test rings run mainline
> kselftest suite on stable releases. The reason is that when a new test
> gets added to test existing code to regression test a bug, we should be
> able to run that test on an older kernel. Hence, it is important to keep
> code that can still test an older kernel and make sure it skips the test
> gracefully on newer releases.
> --- cut Documentation/dev-tools/kselftest.rst ---
> 
> together with
> 
> --- cut Documentation/dev-tools/kselftest.rst ---
>  * First use the headers inside the kernel source and/or git repo, and then the
>    system headers.  Headers for the kernel release as opposed to headers
>    installed by the distro on the system should be the primary focus to be able
>    to find regressions.
> --- cut Documentation/dev-tools/kselftest.rst ---
> 
> It means that selftests should support running binaries built against
> newer kernel sources on system running older kernel. But this might
> be pretty hard to achieve and maintain.
> 
> The normal kernel rules are exactly the opposite. Old binaries must
> be able to run on newer kernels. The old binaries were built against
> older headers.
> 
> IMHO, the testing of stable kernels makes perfect sense. And if we
> want to support it seriously than we need to allow building new
> selftests against headers from the old to-be-tested kernel. And
> it will be possible only when the selftests sources are as much
> selfcontained as possible.
> 
> Does this makes any sense, please?
> 

Gentle bump.  Shuah, I believe that Marcos will be preparing a v3 based
on review comments on the second patch.  We never resolved questions
surrounding building modules selftests/ (the first patch) though.

Regards,

Shuah Khan Dec. 1, 2022, 11:58 p.m. UTC | #9

On 11/30/22 15:22, Joe Lawrence wrote:
> On 7/15/22 10:45 AM, Petr Mladek wrote:
>> On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
>>> On 7/1/22 1:48 AM, Miroslav Benes wrote:
>>>> On Thu, 30 Jun 2022, Shuah Khan wrote:
>>>>>
>>>>> Sorry Nack on this. Let's not add modules under selftests. Any usage of
>>>>> module_init()
>>>>> doesn't belong under selftests.
>>>
>>> Yes I did and after reviewing and thinking about it some more, I decided this
>>> is the right direction go down on.
>>
>> Do you have some particular reason why building modules in selftests
>> directory might cause problems, please?
>>

My reasons are that with this change module_init() propagates out of
strictly kernel space and now is in selftests which are user-space.
Any changes to this interface will be tied to user-space change.

This is my main concern. That is reason why I still ask the question
about why is it necessary to make this change other than self-contained
sources?

>> IMHO, the reason that the test modules are in lib is because the
>> modules were there before selftests. Developers historically loaded them
>> manually or they were built-in. Selftest were added later and are just
>> another way how the module can be loaded. This is the case,
>> for example, for lib/test_printf.c.
>>
>> Otherwise, I do not see any big difference between building binaries
>> and modules under tools/tests/selftests. As I said, in the older
>> thread, IMHO, it makes more sense to have the selftest sources
>> self-contained.

Modules under lib are built when kernel gets built as opposed to when
tests are built. So there is the difference in build order. I do see
a difference from that point of view.

Yes, moving modules under selftests does make the tests self contained.

>>
>>
>> There actually seems to be a principal problem in the following use
>> case:
>>
>> --- cut Documentation/dev-tools/kselftest.rst ---
>> Kselftest from mainline can be run on older stable kernels. Running tests
>> from mainline offers the best coverage. Several test rings run mainline
>> kselftest suite on stable releases. The reason is that when a new test
>> gets added to test existing code to regression test a bug, we should be
>> able to run that test on an older kernel. Hence, it is important to keep
>> code that can still test an older kernel and make sure it skips the test
>> gracefully on newer releases.
>> --- cut Documentation/dev-tools/kselftest.rst ---
>>
>> together with
>>
>> --- cut Documentation/dev-tools/kselftest.rst ---
>>   * First use the headers inside the kernel source and/or git repo, and then the
>>     system headers.  Headers for the kernel release as opposed to headers
>>     installed by the distro on the system should be the primary focus to be able
>>     to find regressions.
>> --- cut Documentation/dev-tools/kselftest.rst ---
>>
>> It means that selftests should support running binaries built against
>> newer kernel sources on system running older kernel. But this might
>> be pretty hard to achieve and maintain.
>>
>> The normal kernel rules are exactly the opposite. Old binaries must
>> be able to run on newer kernels. The old binaries were built against
>> older headers.
>>

This case is applicable to when tests are built on a development system
and binaries are moved to run on a target system.

In general, newer tests offer the best coverage, hence the recommendation
to run newer tests on older kernels assuming that the tests are built
on a newer kernel and backwards should run in a backwards compatible
way on older kernels.

Your use-case might be different from this where you do build tests
on older kernels and run them on it in which case, you might have a
requirement to revision match the tests and kernel. You can still do
so.

>> IMHO, the testing of stable kernels makes perfect sense. And if we
>> want to support it seriously than we need to allow building new
>> selftests against headers from the old to-be-tested kernel. And
>> it will be possible only when the selftests sources are as much
>> selfcontained as possible.
>>

Do you have a requirement that livepatch test has to be revision
matched with the kernel? Even if that is the case, there is no real
reason to move modules under selftests other than keeping them in
one location.

Also I didn't see any changes to README that explains this move and
that this change now makes this test now depends on kernel only
interfaces and hence will have to follow modules built outside of
kernel build.

>> Does this makes any sense, please?
>>
> 
> Gentle bump.  Shuah, I believe that Marcos will be preparing a v3 based
> on review comments on the second patch.  We never resolved questions
> surrounding building modules selftests/ (the first patch) though.
> 

You can send patches again and I would like to hear good reasons other
than self-containing the sources.

thanks,
-- Shuah

Miroslav Benes Dec. 2, 2022, 7:33 a.m. UTC | #10

On Thu, 1 Dec 2022, Shuah Khan wrote:

> On 11/30/22 15:22, Joe Lawrence wrote:
> > On 7/15/22 10:45 AM, Petr Mladek wrote:
> >> On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
> >>> On 7/1/22 1:48 AM, Miroslav Benes wrote:
> >>>> On Thu, 30 Jun 2022, Shuah Khan wrote:
> >>>>>
> >>>>> Sorry Nack on this. Let's not add modules under selftests. Any usage of
> >>>>> module_init()
> >>>>> doesn't belong under selftests.
> >>>
> >>> Yes I did and after reviewing and thinking about it some more, I decided
> >>> this
> >>> is the right direction go down on.
> >>
> >> Do you have some particular reason why building modules in selftests
> >> directory might cause problems, please?
> >>
> 
> My reasons are that with this change module_init() propagates out of
> strictly kernel space and now is in selftests which are user-space.
> Any changes to this interface will be tied to user-space change.

I do not understand this (have not had a cup of tea yet).

module_init() is defined in include/linux/module.h. It is API. Every 
kernel module, in-tree or out-of-tree, uses it. There is only one usage of 
module_init() in Marcos's patch. In a kernel module, in tools/ 
subdirectory yes.

If there is a change to module_init, it might need editing all the call 
sites, yes. That is inherent.

> This is my main concern. That is reason why I still ask the question
> about why is it necessary to make this change other than self-contained
> sources?

I will quote myself from an earlier email in the thread which you have not 
replied to...

"
My main question is different though. As Marcos mentioned before, we would 
like to have our tests really flexible and a possibility to prepare and 
load different live patch modules based on a template is a part of it. 
What is your proposal regarding this? I can imagine having a template in 
lib/livepatch/ which would not be compilable and a script in 
tools/testing/selftests/livepatch/ would copy it many times, amend the 
copies (meaning parameters would be filled in with sed or the code would 
be changed), compile them and load them. But this sounds horrible to me, 
especially when compared to Marcos' approach in this patch set which is 
quite straightforward.
"

Thank you
Miroslav

Petr Mladek Dec. 2, 2022, 9:25 a.m. UTC | #11

On Thu 2022-12-01 16:58:38, Shuah Khan wrote:
> On 11/30/22 15:22, Joe Lawrence wrote:
> > On 7/15/22 10:45 AM, Petr Mladek wrote:
> > > On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
> > > > On 7/1/22 1:48 AM, Miroslav Benes wrote:
> > > > > On Thu, 30 Jun 2022, Shuah Khan wrote:
> > > > > > 
> > > > > > Sorry Nack on this. Let's not add modules under selftests. Any usage of
> > > > > > module_init()
> > > > > > doesn't belong under selftests.
> > > > 
> > > > Yes I did and after reviewing and thinking about it some more, I decided this
> > > > is the right direction go down on.
> > > 
> > > Do you have some particular reason why building modules in selftests
> > > directory might cause problems, please?
> > > 
> 
> My reasons are that with this change module_init() propagates out of
> strictly kernel space and now is in selftests which are user-space.
> Any changes to this interface will be tied to user-space change.

I am sorry but I do not understand the meaning here. module_init() is
called when module is loaded. It is not called in userspace.

Maybe, you mean that modules under lib/ are clearly in-tree
modules. If we move then under tools/ then they will be build
like out-of-tree modules. Except that they will be maintained in-tree
so that it will be easy to keep them in sync.

And I am sure that they will be actively maintained. The fixes are
there to make sure that livepatching still works as expected.
They must pass when any change is done in the livepatch subsystem.
And they must pass when any kernel is released.

The only concern might be how build failure is handled. IMHO, we
need to handle it the same way and test failure.

> In general, newer tests offer the best coverage, hence the recommendation
> to run newer tests on older kernels assuming that the tests are built
> on a newer kernel and backwards should run in a backwards compatible
> way on older kernels.

This works for the userspace interface that should always be backward
compatible. But it does not work for kABI.

> Do you have a requirement that livepatch test has to be revision
> matched with the kernel? Even if that is the case, there is no real
> reason to move modules under selftests other than keeping them in
> one location.

Yes, kABI is not backward compatible. But building the tests
modules out-of-tree way would allow to build test modules with
different kABI from the same sources.

IMHO, this is common for any selttests using kernel modules.
I am sure that neither test_printf nor test_scanf modules
could be loaded in older kernels. Even though it might
be useful to run improved tests for stable-like kernels.

Best Regards,
Petr

Shuah Khan Dec. 2, 2022, 8:03 p.m. UTC | #12

On 12/2/22 02:25, Petr Mladek wrote:
> On Thu 2022-12-01 16:58:38, Shuah Khan wrote:
>> On 11/30/22 15:22, Joe Lawrence wrote:
>>> On 7/15/22 10:45 AM, Petr Mladek wrote:
>>>> On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
>>>>> On 7/1/22 1:48 AM, Miroslav Benes wrote:
>>>>>> On Thu, 30 Jun 2022, Shuah Khan wrote:
>>>>>>>
>>>>>>> Sorry Nack on this. Let's not add modules under selftests. Any usage of
>>>>>>> module_init()
>>>>>>> doesn't belong under selftests.
>>>>>
>>>>> Yes I did and after reviewing and thinking about it some more, I decided this
>>>>> is the right direction go down on.
>>>>
>>>> Do you have some particular reason why building modules in selftests
>>>> directory might cause problems, please?
>>>>
>>
>> My reasons are that with this change module_init() propagates out of
>> strictly kernel space and now is in selftests which are user-space.
>> Any changes to this interface will be tied to user-space change.
> 
> I am sorry but I do not understand the meaning here. module_init() is
> called when module is loaded. It is not called in userspace.
> 
> Maybe, you mean that modules under lib/ are clearly in-tree
> modules. If we move then under tools/ then they will be build
> like out-of-tree modules. Except that they will be maintained in-tree
> so that it will be easy to keep them in sync.
> 

Yes. That is what I mean.

> And I am sure that they will be actively maintained. The fixes are
> there to make sure that livepatching still works as expected.
> They must pass when any change is done in the livepatch subsystem.
> And they must pass when any kernel is released.
> 

In other words, livepatch and kernel are revision matched.

> The only concern might be how build failure is handled. IMHO, we
> need to handle it the same way and test failure.
> 
> 

Yes. This is another concern - build failures. Let's experiment with
modules under selftests and see if this becomes a problem.

>> In general, newer tests offer the best coverage, hence the recommendation
>> to run newer tests on older kernels assuming that the tests are built
>> on a newer kernel and backwards should run in a backwards compatible
>> way on older kernels.
> 
> This works for the userspace interface that should always be backward
> compatible. But it does not work for kABI.
> 

This is broader than revision matching. Tests should gracefully exit
with skip when a config option they depend on is disabled. The same
gets extended to older kernel versions.

> 
>> Do you have a requirement that livepatch test has to be revision
>> matched with the kernel? Even if that is the case, there is no real
>> reason to move modules under selftests other than keeping them in
>> one location.
> 
> Yes, kABI is not backward compatible. But building the tests
> modules out-of-tree way would allow to build test modules with
> different kABI from the same sources.
> 

Okay. This is a solid reason for livepatch modules to live under
sefltests. Let's capture this in README and the other updates that
need to be made to it in v3.

thanks,
-- Shuah

Shuah Khan Dec. 2, 2022, 8:17 p.m. UTC | #13

On 12/2/22 00:33, Miroslav Benes wrote:
> On Thu, 1 Dec 2022, Shuah Khan wrote:
> 
>> On 11/30/22 15:22, Joe Lawrence wrote:
>>> On 7/15/22 10:45 AM, Petr Mladek wrote:
>>>> On Fri 2022-07-01 16:13:50, Shuah Khan wrote:
>>>>> On 7/1/22 1:48 AM, Miroslav Benes wrote:
>>>>>> On Thu, 30 Jun 2022, Shuah Khan wrote:
>>>>>>>
>>>>>>> Sorry Nack on this. Let's not add modules under selftests. Any usage of
>>>>>>> module_init()
>>>>>>> doesn't belong under selftests.
>>>>>
>>>>> Yes I did and after reviewing and thinking about it some more, I decided
>>>>> this
>>>>> is the right direction go down on.
>>>>
>>>> Do you have some particular reason why building modules in selftests
>>>> directory might cause problems, please?
>>>>
>>
>> My reasons are that with this change module_init() propagates out of
>> strictly kernel space and now is in selftests which are user-space.
>> Any changes to this interface will be tied to user-space change.
> 
> I do not understand this (have not had a cup of tea yet).
> 
> module_init() is defined in include/linux/module.h. It is API. Every
> kernel module, in-tree or out-of-tree, uses it. There is only one usage of
> module_init() in Marcos's patch. In a kernel module, in tools/
> subdirectory yes.
> 
> If there is a change to module_init, it might need editing all the call
> sites, yes. That is inherent.
>   
>> This is my main concern. That is reason why I still ask the question
>> about why is it necessary to make this change other than self-contained
>> sources?
> 
> I will quote myself from an earlier email in the thread which you have not
> replied to...
> 
> "
> My main question is different though. As Marcos mentioned before, we would
> like to have our tests really flexible and a possibility to prepare and
> load different live patch modules based on a template is a part of it.
> What is your proposal regarding this? I can imagine having a template in
> lib/livepatch/ which would not be compilable and a script in
> tools/testing/selftests/livepatch/ would copy it many times, amend the
> copies (meaning parameters would be filled in with sed or the code would
> be changed), compile them and load them. But this sounds horrible to me,
> especially when compared to Marcos' approach in this patch set which is
> quite straightforward.
> "
> 

Responded to Petr's message -

"Yes, kABI is not backward compatible. But building the tests
modules out-of-tree way would allow to build test modules with
different kABI from the same sources."

This is a solid reason for livepatch modules to live under
sefltests. Let's capture this in README and the other updates that
need to be made to it in v3.

thanks,
-- Shuah

Joe Lawrence Dec. 2, 2022, 9:05 p.m. UTC | #14

On 12/2/22 3:03 PM, Shuah Khan wrote:
> On 12/2/22 02:25, Petr Mladek wrote:
>>
>> Yes, kABI is not backward compatible. But building the tests
>> modules out-of-tree way would allow to build test modules with
>> different kABI from the same sources.
>>
> 
> Okay. This is a solid reason for livepatch modules to live under
> sefltests. Let's capture this in README and the other updates that
> need to be made to it in v3.
> 

One additional benefit, however small, is that I think everyone is
building production livepatches, source based or via kpatch-build, as
out-of-tree modules.  (Miroslav/Petr/Marcos please correct me if I'm
wrong about source based.)

Having the livepatch selftests live under selftests/ would imply that
new subsystem (build) features would have to support the production
build use case from the get go.

Regards,

Marcos Paulo de Souza Dec. 5, 2022, 5:30 p.m. UTC | #15

On Fri, Dec 02, 2022 at 04:05:25PM -0500, Joe Lawrence wrote:
> On 12/2/22 3:03 PM, Shuah Khan wrote:
> > On 12/2/22 02:25, Petr Mladek wrote:
> >>
> >> Yes, kABI is not backward compatible. But building the tests
> >> modules out-of-tree way would allow to build test modules with
> >> different kABI from the same sources.
> >>
> > 
> > Okay. This is a solid reason for livepatch modules to live under
> > sefltests. Let's capture this in README and the other updates that
> > need to be made to it in v3.
> > 
> 
> One additional benefit, however small, is that I think everyone is
> building production livepatches, source based or via kpatch-build, as
> out-of-tree modules.  (Miroslav/Petr/Marcos please correct me if I'm
> wrong about source based.)

You're correct. We construct our livepatches source based and build them
as out-of-tree modules.

> 
> Having the livepatch selftests live under selftests/ would imply that
> new subsystem (build) features would have to support the production
> build use case from the get go.

Agreed. This change makes the modules to be built as real world livepatches.

> 
> Regards,
> 
> -- 
> Joe
>

[v2,0/2] livepatch: Move tests from lib/livepatch to selftests/livepatch

Message

Comments