mbox series

[v2,bpf-next,0/7] Support kernel module ksym variables

Message ID 20210108220930.482456-1-andrii@kernel.org
Headers show
Series Support kernel module ksym variables | expand

Message

Andrii Nakryiko Jan. 8, 2021, 10:09 p.m. UTC
Add support for using kernel module global variables (__ksym externs in BPF
program). BPF verifier will now support ldimm64 with src_reg=BPF_PSEUDO_BTF_ID
and non-zero insn[1].imm field, specifying module BTF's FD. In such case,
module BTF object, similarly to BPF maps referenced from ldimm64 with
src_reg=BPF_PSEUDO_MAP_FD, will be recorded in bpf_progran's auxiliary data
and refcnt will be increased for both BTF object itself and its kernel module.
This makes sure kernel module won't be unloaded from under active attached BPF
program. These refcounts will be dropped when BPF program is unloaded.

New selftest validates all this is working as intended. bpf_testmod.ko is
extended with per-CPU variable. Selftests expects the latest pahole changes
(soon to be released as v1.20) to generate per-CPU variable BTF info for
kernel module.

v1->v2:
  - fixed few compiler warnings, posted as separate pre-patches;
rfc->v1:
  - use sys_membarrier(MEMBARRIER_CMD_GLOBAL) (Alexei).

Cc: Hao Luo <haoluo@google.com>
Andrii Nakryiko (7):
  bpf: add bpf_patch_call_args prototype to include/linux/bpf.h
  bpf: avoid warning when re-casting __bpf_call_base into
    __bpf_call_base_args
  bpf: declare __bpf_free_used_maps() unconditionally
  selftests/bpf: sync RCU before unloading bpf_testmod
  bpf: support BPF ksym variables in kernel modules
  libbpf: support kernel module ksym externs
  selftests/bpf: test kernel module ksym externs

 include/linux/bpf.h                           |  18 ++-
 include/linux/bpf_verifier.h                  |   3 +
 include/linux/btf.h                           |   3 +
 include/linux/filter.h                        |   2 +-
 kernel/bpf/btf.c                              |  31 +++-
 kernel/bpf/core.c                             |  23 +++
 kernel/bpf/verifier.c                         | 149 ++++++++++++++----
 tools/lib/bpf/libbpf.c                        |  47 ++++--
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   3 +
 .../selftests/bpf/prog_tests/btf_map_in_map.c |  33 ----
 .../selftests/bpf/prog_tests/ksyms_module.c   |  33 ++++
 .../selftests/bpf/progs/test_ksyms_module.c   |  26 +++
 tools/testing/selftests/bpf/test_progs.c      |  11 ++
 tools/testing/selftests/bpf/test_progs.h      |   1 +
 14 files changed, 300 insertions(+), 83 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

Comments

Yonghong Song Jan. 11, 2021, 4:02 a.m. UTC | #1
On 1/8/21 2:09 PM, Andrii Nakryiko wrote:
> Add bpf_patch_call_args() prototype. This function is called from BPF verifier

> and only if CONFIG_BPF_JIT_ALWAYS_ON is not defined. This fixes compiler

> warning about missing prototype in some kernel configurations.

> 

> Reported-by: kernel test robot <lkp@intel.com>

> Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>


Acked-by: Yonghong Song <yhs@fb.com>
Yonghong Song Jan. 11, 2021, 4:03 a.m. UTC | #2
On 1/8/21 2:09 PM, Andrii Nakryiko wrote:
> BPF interpreter uses extra input argument, so re-casts __bpf_call_base into

> __bpf_call_base_args. Avoid compiler warning about incompatible function

> prototypes by casting to void * first.

> 

> Reported-by: kernel test robot <lkp@intel.com>

> Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>


Acked-by: Yonghong Song <yhs@fb.com>
Yonghong Song Jan. 11, 2021, 4:13 a.m. UTC | #3
On 1/8/21 2:09 PM, Andrii Nakryiko wrote:
> Add support for directly accessing kernel module variables from BPF programs

> using special ldimm64 instructions. This functionality builds upon vmlinux

> ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow

> specifying kernel module BTF's FD in insn[1].imm field.

> 

> During BPF program load time, verifier will resolve FD to BTF object and will

> take reference on BTF object itself and, for module BTFs, corresponding module

> as well, to make sure it won't be unloaded from under running BPF program. The

> mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

> 

> One interesting change is also in how per-CPU variable is determined. The

> logic is to find .data..percpu data section in provided BTF, but both vmlinux

> and module each have their own .data..percpu entries in BTF. So for module's

> case, the search for DATASEC record needs to look at only module's added BTF

> types. This is implemented with custom search function.

> 

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>


Ack with a minor nit below.

Acked-by: Yonghong Song <yhs@fb.com>


> ---

>   include/linux/bpf.h          |  10 +++

>   include/linux/bpf_verifier.h |   3 +

>   include/linux/btf.h          |   3 +

>   kernel/bpf/btf.c             |  31 +++++++-

>   kernel/bpf/core.c            |  23 ++++++

>   kernel/bpf/verifier.c        | 149 ++++++++++++++++++++++++++++-------

>   6 files changed, 189 insertions(+), 30 deletions(-)

> 

[...]
>   /* replace pseudo btf_id with kernel symbol address */

>   static int check_pseudo_btf_id(struct bpf_verifier_env *env,

>   			       struct bpf_insn *insn,

> @@ -9710,48 +9735,57 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,

>   {

>   	const struct btf_var_secinfo *vsi;

>   	const struct btf_type *datasec;

> +	struct btf_mod_pair *btf_mod;

>   	const struct btf_type *t;

>   	const char *sym_name;

>   	bool percpu = false;

>   	u32 type, id = insn->imm;

> +	struct btf *btf;

>   	s32 datasec_id;

>   	u64 addr;

> -	int i;

> +	int i, btf_fd, err;

>   

> -	if (!btf_vmlinux) {

> -		verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n");

> -		return -EINVAL;

> -	}

> -

> -	if (insn[1].imm != 0) {

> -		verbose(env, "reserved field (insn[1].imm) is used in pseudo_btf_id ldimm64 insn.\n");

> -		return -EINVAL;

> +	btf_fd = insn[1].imm;

> +	if (btf_fd) {

> +		btf = btf_get_by_fd(btf_fd);

> +		if (IS_ERR(btf)) {

> +			verbose(env, "invalid module BTF object FD specified.\n");

> +			return -EINVAL;

> +		}

> +	} else {

> +		if (!btf_vmlinux) {

> +			verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n");

> +			return -EINVAL;

> +		}

> +		btf = btf_vmlinux;

> +		btf_get(btf);

>   	}

>   

> -	t = btf_type_by_id(btf_vmlinux, id);

> +	t = btf_type_by_id(btf, id);

>   	if (!t) {

>   		verbose(env, "ldimm64 insn specifies invalid btf_id %d.\n", id);

> -		return -ENOENT;

> +		err = -ENOENT;

> +		goto err_put;

>   	}

>   

>   	if (!btf_type_is_var(t)) {

> -		verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n",

> -			id);

> -		return -EINVAL;

> +		verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id);

> +		err = -EINVAL;

> +		goto err_put;

>   	}

>   

> -	sym_name = btf_name_by_offset(btf_vmlinux, t->name_off);

> +	sym_name = btf_name_by_offset(btf, t->name_off);

>   	addr = kallsyms_lookup_name(sym_name);

>   	if (!addr) {

>   		verbose(env, "ldimm64 failed to find the address for kernel symbol '%s'.\n",

>   			sym_name);

> -		return -ENOENT;

> +		err = -ENOENT;

> +		goto err_put;

>   	}

>   

> -	datasec_id = btf_find_by_name_kind(btf_vmlinux, ".data..percpu",

> -					   BTF_KIND_DATASEC);

> +	datasec_id = find_btf_percpu_datasec(btf);

>   	if (datasec_id > 0) {

> -		datasec = btf_type_by_id(btf_vmlinux, datasec_id);

> +		datasec = btf_type_by_id(btf, datasec_id);

>   		for_each_vsi(i, datasec, vsi) {

>   			if (vsi->type == id) {

>   				percpu = true;

> @@ -9764,10 +9798,10 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,

>   	insn[1].imm = addr >> 32;

>   

>   	type = t->type;

> -	t = btf_type_skip_modifiers(btf_vmlinux, type, NULL);

> +	t = btf_type_skip_modifiers(btf, type, NULL);

>   	if (percpu) {

>   		aux->btf_var.reg_type = PTR_TO_PERCPU_BTF_ID;

> -		aux->btf_var.btf = btf_vmlinux;

> +		aux->btf_var.btf = btf;

>   		aux->btf_var.btf_id = type;

>   	} else if (!btf_type_is_struct(t)) {

>   		const struct btf_type *ret;

> @@ -9775,21 +9809,54 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,

>   		u32 tsize;

>   

>   		/* resolve the type size of ksym. */

> -		ret = btf_resolve_size(btf_vmlinux, t, &tsize);

> +		ret = btf_resolve_size(btf, t, &tsize);

>   		if (IS_ERR(ret)) {

> -			tname = btf_name_by_offset(btf_vmlinux, t->name_off);

> +			tname = btf_name_by_offset(btf, t->name_off);

>   			verbose(env, "ldimm64 unable to resolve the size of type '%s': %ld\n",

>   				tname, PTR_ERR(ret));

> -			return -EINVAL;

> +			err = -EINVAL;

> +			goto err_put;

>   		}

>   		aux->btf_var.reg_type = PTR_TO_MEM;

>   		aux->btf_var.mem_size = tsize;

>   	} else {

>   		aux->btf_var.reg_type = PTR_TO_BTF_ID;

> -		aux->btf_var.btf = btf_vmlinux;

> +		aux->btf_var.btf = btf;

>   		aux->btf_var.btf_id = type;

>   	}

> +

> +	/* check whether we recorded this BTF (and maybe module) already */

> +	for (i = 0; i < env->used_btf_cnt; i++) {

> +		if (env->used_btfs[i].btf == btf) {

> +			btf_put(btf);

> +			return 0;


An alternative way is to change the above code as
			err = 0;
			goto err_put;

> +		}

> +	}

> +

> +	if (env->used_btf_cnt >= MAX_USED_BTFS) {

> +		err = -E2BIG;

> +		goto err_put;

> +	}

> +

> +	btf_mod = &env->used_btfs[env->used_btf_cnt];

> +	btf_mod->btf = btf;

> +	btf_mod->module = NULL;

> +

> +	/* if we reference variables from kernel module, bump its refcount */

> +	if (btf_is_module(btf)) {

> +		btf_mod->module = btf_try_get_module(btf);

> +		if (!btf_mod->module) {

> +			err = -ENXIO;

> +			goto err_put;

> +		}

> +	}

> +

> +	env->used_btf_cnt++;

> +

>   	return 0;

> +err_put:

> +	btf_put(btf);

> +	return err;

>   }

>   

[...]
Yonghong Song Jan. 11, 2021, 4:18 a.m. UTC | #4
On 1/8/21 2:09 PM, Andrii Nakryiko wrote:
> Add per-CPU variable to bpf_testmod.ko and use those from new selftest to

> validate it works end-to-end.

> 

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>


Ack with a nit below.

Acked-by: Yonghong Song <yhs@fb.com>


> ---

>   .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  3 ++

>   .../selftests/bpf/prog_tests/ksyms_module.c   | 33 +++++++++++++++++++

>   .../selftests/bpf/progs/test_ksyms_module.c   | 26 +++++++++++++++

>   3 files changed, 62 insertions(+)

>   create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c

>   create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

> 

> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> index 2df19d73ca49..0b991e115d1f 100644

> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> @@ -3,6 +3,7 @@

>   #include <linux/error-injection.h>

>   #include <linux/init.h>

>   #include <linux/module.h>

> +#include <linux/percpu-defs.h>

>   #include <linux/sysfs.h>

>   #include <linux/tracepoint.h>

>   #include "bpf_testmod.h"

> @@ -10,6 +11,8 @@

>   #define CREATE_TRACE_POINTS

>   #include "bpf_testmod-events.h"

>   

> +DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;

> +

>   noinline ssize_t

>   bpf_testmod_test_read(struct file *file, struct kobject *kobj,

>   		      struct bin_attribute *bin_attr,

> diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> new file mode 100644

> index 000000000000..7fa3d8b6ca30

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> @@ -0,0 +1,33 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +

> +#include <test_progs.h>

> +#include <bpf/libbpf.h>

> +#include <bpf/btf.h>

> +#include "test_ksyms_module.skel.h"

> +

> +static int duration;

> +

> +void test_ksyms_module(void)

> +{

> +	struct test_ksyms_module* skel;

> +	struct test_ksyms_module__bss *bss;

> +	int err;

> +

> +	skel = test_ksyms_module__open_and_load();

> +	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))

> +		return;

> +	bss = skel->bss;

> +

> +	err = test_ksyms_module__attach(skel);

> +	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))

> +		goto cleanup;

> +

> +	usleep(1);


The above bss = skel->bss might be moved here for better readability.
Or better, you can remove definition bss and just use skel->bss
in below two ASSERT_EQs.

> +	ASSERT_EQ(bss->triggered, true, "triggered");

> +	ASSERT_EQ(bss->out_mod_ksym_global, 123, "global_ksym_val");

> +

> +cleanup:

> +	test_ksyms_module__destroy(skel);

> +}

> diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_module.c b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> new file mode 100644

> index 000000000000..d6a0b3086b90

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> @@ -0,0 +1,26 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +

> +#include "vmlinux.h"

> +

> +#include <bpf/bpf_helpers.h>

> +

> +extern const int bpf_testmod_ksym_percpu __ksym;

> +

> +int out_mod_ksym_global = 0;

> +bool triggered = false;

> +

> +SEC("raw_tp/sys_enter")

> +int handler(const void *ctx)

> +{

> +	int *val;

> +	__u32 cpu;

> +

> +	val = (int *)bpf_this_cpu_ptr(&bpf_testmod_ksym_percpu);

> +	out_mod_ksym_global = *val;

> +	triggered = true;

> +

> +	return 0;

> +}

> +

> +char LICENSE[] SEC("license") = "GPL";

>
Hao Luo Jan. 11, 2021, 6:59 p.m. UTC | #5
Acked-by: Hao Luo <haoluo@google.com>, with a suggestion on adding a comment.


On Fri, Jan 8, 2021 at 2:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>

> Add support for directly accessing kernel module variables from BPF programs

> using special ldimm64 instructions. This functionality builds upon vmlinux

> ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow

> specifying kernel module BTF's FD in insn[1].imm field.

>

> During BPF program load time, verifier will resolve FD to BTF object and will

> take reference on BTF object itself and, for module BTFs, corresponding module

> as well, to make sure it won't be unloaded from under running BPF program. The

> mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

>

> One interesting change is also in how per-CPU variable is determined. The

> logic is to find .data..percpu data section in provided BTF, but both vmlinux

> and module each have their own .data..percpu entries in BTF. So for module's

> case, the search for DATASEC record needs to look at only module's added BTF

> types. This is implemented with custom search function.

>

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

> ---

>  include/linux/bpf.h          |  10 +++

>  include/linux/bpf_verifier.h |   3 +

>  include/linux/btf.h          |   3 +

>  kernel/bpf/btf.c             |  31 +++++++-

>  kernel/bpf/core.c            |  23 ++++++

>  kernel/bpf/verifier.c        | 149 ++++++++++++++++++++++++++++-------

>  6 files changed, 189 insertions(+), 30 deletions(-)


[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c

> index 17270b8404f1..af94c6871ab8 100644

> --- a/kernel/bpf/verifier.c

> +++ b/kernel/bpf/verifier.c

> @@ -9703,6 +9703,31 @@ static int do_check(struct bpf_verifier_env *env)

>         return 0;

>  }

>

> +static int find_btf_percpu_datasec(struct btf *btf)

> +{

> +       const struct btf_type *t;

> +       const char *tname;

> +       int i, n;

> +


It would be good to add a short comment here explaining the reason why
the search for DATASEC in the module case needs to skip entries.

> +       n = btf_nr_types(btf);

> +       if (btf_is_module(btf))

> +               i = btf_nr_types(btf_vmlinux);

> +       else

> +               i = 1;

> +

> +       for(; i < n; i++) {

> +               t = btf_type_by_id(btf, i);

> +               if (BTF_INFO_KIND(t->info) != BTF_KIND_DATASEC)

> +                       continue;

> +

> +               tname = btf_name_by_offset(btf, t->name_off);

> +               if (!strcmp(tname, ".data..percpu"))

> +                       return i;

> +       }

> +

> +       return -ENOENT;

> +}

[...]
> 2.24.1

>
Hao Luo Jan. 11, 2021, 7 p.m. UTC | #6
Acked-by: Hao Luo <haoluo@google.com>



On Fri, Jan 8, 2021 at 2:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>

> Add per-CPU variable to bpf_testmod.ko and use those from new selftest to

> validate it works end-to-end.

>

> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

> ---

>  .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  3 ++

>  .../selftests/bpf/prog_tests/ksyms_module.c   | 33 +++++++++++++++++++

>  .../selftests/bpf/progs/test_ksyms_module.c   | 26 +++++++++++++++

>  3 files changed, 62 insertions(+)

>  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c

>  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

>

> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> index 2df19d73ca49..0b991e115d1f 100644

> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> @@ -3,6 +3,7 @@

>  #include <linux/error-injection.h>

>  #include <linux/init.h>

>  #include <linux/module.h>

> +#include <linux/percpu-defs.h>

>  #include <linux/sysfs.h>

>  #include <linux/tracepoint.h>

>  #include "bpf_testmod.h"

> @@ -10,6 +11,8 @@

>  #define CREATE_TRACE_POINTS

>  #include "bpf_testmod-events.h"

>

> +DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;

> +

>  noinline ssize_t

>  bpf_testmod_test_read(struct file *file, struct kobject *kobj,

>                       struct bin_attribute *bin_attr,

> diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> new file mode 100644

> index 000000000000..7fa3d8b6ca30

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> @@ -0,0 +1,33 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +

> +#include <test_progs.h>

> +#include <bpf/libbpf.h>

> +#include <bpf/btf.h>

> +#include "test_ksyms_module.skel.h"

> +

> +static int duration;

> +

> +void test_ksyms_module(void)

> +{

> +       struct test_ksyms_module* skel;

> +       struct test_ksyms_module__bss *bss;

> +       int err;

> +

> +       skel = test_ksyms_module__open_and_load();

> +       if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))

> +               return;

> +       bss = skel->bss;

> +

> +       err = test_ksyms_module__attach(skel);

> +       if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))

> +               goto cleanup;

> +

> +       usleep(1);

> +

> +       ASSERT_EQ(bss->triggered, true, "triggered");

> +       ASSERT_EQ(bss->out_mod_ksym_global, 123, "global_ksym_val");

> +

> +cleanup:

> +       test_ksyms_module__destroy(skel);

> +}

> diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_module.c b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> new file mode 100644

> index 000000000000..d6a0b3086b90

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> @@ -0,0 +1,26 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +

> +#include "vmlinux.h"

> +

> +#include <bpf/bpf_helpers.h>

> +

> +extern const int bpf_testmod_ksym_percpu __ksym;

> +

> +int out_mod_ksym_global = 0;

> +bool triggered = false;

> +

> +SEC("raw_tp/sys_enter")

> +int handler(const void *ctx)

> +{

> +       int *val;

> +       __u32 cpu;

> +

> +       val = (int *)bpf_this_cpu_ptr(&bpf_testmod_ksym_percpu);

> +       out_mod_ksym_global = *val;

> +       triggered = true;

> +

> +       return 0;

> +}

> +

> +char LICENSE[] SEC("license") = "GPL";

> --

> 2.24.1

>
Andrii Nakryiko Jan. 11, 2021, 9:29 p.m. UTC | #7
On Sun, Jan 10, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
>

>

>

> On 1/8/21 2:09 PM, Andrii Nakryiko wrote:

> > Add support for directly accessing kernel module variables from BPF programs

> > using special ldimm64 instructions. This functionality builds upon vmlinux

> > ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow

> > specifying kernel module BTF's FD in insn[1].imm field.

> >

> > During BPF program load time, verifier will resolve FD to BTF object and will

> > take reference on BTF object itself and, for module BTFs, corresponding module

> > as well, to make sure it won't be unloaded from under running BPF program. The

> > mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

> >

> > One interesting change is also in how per-CPU variable is determined. The

> > logic is to find .data..percpu data section in provided BTF, but both vmlinux

> > and module each have their own .data..percpu entries in BTF. So for module's

> > case, the search for DATASEC record needs to look at only module's added BTF

> > types. This is implemented with custom search function.

> >

> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

>

> Ack with a minor nit below.

>

> Acked-by: Yonghong Song <yhs@fb.com>

>

> > ---

> >   include/linux/bpf.h          |  10 +++

> >   include/linux/bpf_verifier.h |   3 +

> >   include/linux/btf.h          |   3 +

> >   kernel/bpf/btf.c             |  31 +++++++-

> >   kernel/bpf/core.c            |  23 ++++++

> >   kernel/bpf/verifier.c        | 149 ++++++++++++++++++++++++++++-------

> >   6 files changed, 189 insertions(+), 30 deletions(-)

> >

> [...]

> >   /* replace pseudo btf_id with kernel symbol address */

> >   static int check_pseudo_btf_id(struct bpf_verifier_env *env,

> >                              struct bpf_insn *insn,


[...]

> >       } else {

> >               aux->btf_var.reg_type = PTR_TO_BTF_ID;

> > -             aux->btf_var.btf = btf_vmlinux;

> > +             aux->btf_var.btf = btf;

> >               aux->btf_var.btf_id = type;

> >       }

> > +

> > +     /* check whether we recorded this BTF (and maybe module) already */

> > +     for (i = 0; i < env->used_btf_cnt; i++) {

> > +             if (env->used_btfs[i].btf == btf) {

> > +                     btf_put(btf);

> > +                     return 0;

>

> An alternative way is to change the above code as

>                         err = 0;

>                         goto err_put;


I didn't do it, because it's not really an error case, which err_put
implies. If in the future we'll have some more clean up to do, it
might make sense, I suppose.

>

> > +             }

> > +     }

> > +

> > +     if (env->used_btf_cnt >= MAX_USED_BTFS) {

> > +             err = -E2BIG;

> > +             goto err_put;

> > +     }

> > +

> > +     btf_mod = &env->used_btfs[env->used_btf_cnt];

> > +     btf_mod->btf = btf;

> > +     btf_mod->module = NULL;

> > +

> > +     /* if we reference variables from kernel module, bump its refcount */

> > +     if (btf_is_module(btf)) {

> > +             btf_mod->module = btf_try_get_module(btf);

> > +             if (!btf_mod->module) {

> > +                     err = -ENXIO;

> > +                     goto err_put;

> > +             }

> > +     }

> > +

> > +     env->used_btf_cnt++;

> > +

> >       return 0;

> > +err_put:

> > +     btf_put(btf);

> > +     return err;

> >   }

> >

> [...]
Andrii Nakryiko Jan. 11, 2021, 9:31 p.m. UTC | #8
On Mon, Jan 11, 2021 at 11:00 AM Hao Luo <haoluo@google.com> wrote:
>

> Acked-by: Hao Luo <haoluo@google.com>, with a suggestion on adding a comment.

>


top posting your Ack? :)


> On Fri, Jan 8, 2021 at 2:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:

> >

> > Add support for directly accessing kernel module variables from BPF programs

> > using special ldimm64 instructions. This functionality builds upon vmlinux

> > ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow

> > specifying kernel module BTF's FD in insn[1].imm field.

> >

> > During BPF program load time, verifier will resolve FD to BTF object and will

> > take reference on BTF object itself and, for module BTFs, corresponding module

> > as well, to make sure it won't be unloaded from under running BPF program. The

> > mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

> >

> > One interesting change is also in how per-CPU variable is determined. The

> > logic is to find .data..percpu data section in provided BTF, but both vmlinux

> > and module each have their own .data..percpu entries in BTF. So for module's

> > case, the search for DATASEC record needs to look at only module's added BTF

> > types. This is implemented with custom search function.

> >

> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

> > ---

> >  include/linux/bpf.h          |  10 +++

> >  include/linux/bpf_verifier.h |   3 +

> >  include/linux/btf.h          |   3 +

> >  kernel/bpf/btf.c             |  31 +++++++-

> >  kernel/bpf/core.c            |  23 ++++++

> >  kernel/bpf/verifier.c        | 149 ++++++++++++++++++++++++++++-------

> >  6 files changed, 189 insertions(+), 30 deletions(-)

>

> [...]

>

> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c

> > index 17270b8404f1..af94c6871ab8 100644

> > --- a/kernel/bpf/verifier.c

> > +++ b/kernel/bpf/verifier.c

> > @@ -9703,6 +9703,31 @@ static int do_check(struct bpf_verifier_env *env)

> >         return 0;

> >  }

> >

> > +static int find_btf_percpu_datasec(struct btf *btf)

> > +{

> > +       const struct btf_type *t;

> > +       const char *tname;

> > +       int i, n;

> > +

>

> It would be good to add a short comment here explaining the reason why

> the search for DATASEC in the module case needs to skip entries.


I can copy-paste parts of the commit message with that explanation, if
I'll need another version. If not, I can send a follow-up patch.

>

> > +       n = btf_nr_types(btf);

> > +       if (btf_is_module(btf))

> > +               i = btf_nr_types(btf_vmlinux);

> > +       else

> > +               i = 1;

> > +

> > +       for(; i < n; i++) {

> > +               t = btf_type_by_id(btf, i);

> > +               if (BTF_INFO_KIND(t->info) != BTF_KIND_DATASEC)

> > +                       continue;

> > +

> > +               tname = btf_name_by_offset(btf, t->name_off);

> > +               if (!strcmp(tname, ".data..percpu"))

> > +                       return i;

> > +       }

> > +

> > +       return -ENOENT;

> > +}

> [...]

> > 2.24.1

> >
Andrii Nakryiko Jan. 11, 2021, 9:40 p.m. UTC | #9
On Sun, Jan 10, 2021 at 8:18 PM Yonghong Song <yhs@fb.com> wrote:
>

>

>

> On 1/8/21 2:09 PM, Andrii Nakryiko wrote:

> > Add per-CPU variable to bpf_testmod.ko and use those from new selftest to

> > validate it works end-to-end.

> >

> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

>

> Ack with a nit below.

>

> Acked-by: Yonghong Song <yhs@fb.com>

>

> > ---

> >   .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  3 ++

> >   .../selftests/bpf/prog_tests/ksyms_module.c   | 33 +++++++++++++++++++

> >   .../selftests/bpf/progs/test_ksyms_module.c   | 26 +++++++++++++++

> >   3 files changed, 62 insertions(+)

> >   create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> >   create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

> >

> > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> > index 2df19d73ca49..0b991e115d1f 100644

> > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

> > @@ -3,6 +3,7 @@

> >   #include <linux/error-injection.h>

> >   #include <linux/init.h>

> >   #include <linux/module.h>

> > +#include <linux/percpu-defs.h>

> >   #include <linux/sysfs.h>

> >   #include <linux/tracepoint.h>

> >   #include "bpf_testmod.h"

> > @@ -10,6 +11,8 @@

> >   #define CREATE_TRACE_POINTS

> >   #include "bpf_testmod-events.h"

> >

> > +DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;

> > +

> >   noinline ssize_t

> >   bpf_testmod_test_read(struct file *file, struct kobject *kobj,

> >                     struct bin_attribute *bin_attr,

> > diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> > new file mode 100644

> > index 000000000000..7fa3d8b6ca30

> > --- /dev/null

> > +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c

> > @@ -0,0 +1,33 @@

> > +// SPDX-License-Identifier: GPL-2.0

> > +/* Copyright (c) 2021 Facebook */

> > +

> > +#include <test_progs.h>

> > +#include <bpf/libbpf.h>

> > +#include <bpf/btf.h>

> > +#include "test_ksyms_module.skel.h"

> > +

> > +static int duration;

> > +

> > +void test_ksyms_module(void)

> > +{

> > +     struct test_ksyms_module* skel;

> > +     struct test_ksyms_module__bss *bss;

> > +     int err;

> > +

> > +     skel = test_ksyms_module__open_and_load();

> > +     if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))

> > +             return;

> > +     bss = skel->bss;

> > +

> > +     err = test_ksyms_module__attach(skel);

> > +     if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))

> > +             goto cleanup;

> > +

> > +     usleep(1);

>

> The above bss = skel->bss might be moved here for better readability.

> Or better, you can remove definition bss and just use skel->bss

> in below two ASSERT_EQs.


Sure, I'll just inline for such a short test.

>

> > +     ASSERT_EQ(bss->triggered, true, "triggered");

> > +     ASSERT_EQ(bss->out_mod_ksym_global, 123, "global_ksym_val");

> > +

> > +cleanup:

> > +     test_ksyms_module__destroy(skel);

> > +}

> > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_module.c b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> > new file mode 100644

> > index 000000000000..d6a0b3086b90

> > --- /dev/null

> > +++ b/tools/testing/selftests/bpf/progs/test_ksyms_module.c

> > @@ -0,0 +1,26 @@

> > +// SPDX-License-Identifier: GPL-2.0

> > +/* Copyright (c) 2021 Facebook */

> > +

> > +#include "vmlinux.h"

> > +

> > +#include <bpf/bpf_helpers.h>

> > +

> > +extern const int bpf_testmod_ksym_percpu __ksym;

> > +

> > +int out_mod_ksym_global = 0;

> > +bool triggered = false;

> > +

> > +SEC("raw_tp/sys_enter")

> > +int handler(const void *ctx)

> > +{

> > +     int *val;

> > +     __u32 cpu;

> > +

> > +     val = (int *)bpf_this_cpu_ptr(&bpf_testmod_ksym_percpu);

> > +     out_mod_ksym_global = *val;

> > +     triggered = true;

> > +

> > +     return 0;

> > +}

> > +

> > +char LICENSE[] SEC("license") = "GPL";

> >
Yonghong Song Jan. 12, 2021, 1:25 a.m. UTC | #10
On 1/11/21 1:29 PM, Andrii Nakryiko wrote:
> On Sun, Jan 10, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:

>>

>>

>>

>> On 1/8/21 2:09 PM, Andrii Nakryiko wrote:

>>> Add support for directly accessing kernel module variables from BPF programs

>>> using special ldimm64 instructions. This functionality builds upon vmlinux

>>> ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow

>>> specifying kernel module BTF's FD in insn[1].imm field.

>>>

>>> During BPF program load time, verifier will resolve FD to BTF object and will

>>> take reference on BTF object itself and, for module BTFs, corresponding module

>>> as well, to make sure it won't be unloaded from under running BPF program. The

>>> mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

>>>

>>> One interesting change is also in how per-CPU variable is determined. The

>>> logic is to find .data..percpu data section in provided BTF, but both vmlinux

>>> and module each have their own .data..percpu entries in BTF. So for module's

>>> case, the search for DATASEC record needs to look at only module's added BTF

>>> types. This is implemented with custom search function.

>>>

>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

>>

>> Ack with a minor nit below.

>>

>> Acked-by: Yonghong Song <yhs@fb.com>

>>

>>> ---

>>>    include/linux/bpf.h          |  10 +++

>>>    include/linux/bpf_verifier.h |   3 +

>>>    include/linux/btf.h          |   3 +

>>>    kernel/bpf/btf.c             |  31 +++++++-

>>>    kernel/bpf/core.c            |  23 ++++++

>>>    kernel/bpf/verifier.c        | 149 ++++++++++++++++++++++++++++-------

>>>    6 files changed, 189 insertions(+), 30 deletions(-)

>>>

>> [...]

>>>    /* replace pseudo btf_id with kernel symbol address */

>>>    static int check_pseudo_btf_id(struct bpf_verifier_env *env,

>>>                               struct bpf_insn *insn,

> 

> [...]

> 

>>>        } else {

>>>                aux->btf_var.reg_type = PTR_TO_BTF_ID;

>>> -             aux->btf_var.btf = btf_vmlinux;

>>> +             aux->btf_var.btf = btf;

>>>                aux->btf_var.btf_id = type;

>>>        }

>>> +

>>> +     /* check whether we recorded this BTF (and maybe module) already */

>>> +     for (i = 0; i < env->used_btf_cnt; i++) {

>>> +             if (env->used_btfs[i].btf == btf) {

>>> +                     btf_put(btf);

>>> +                     return 0;

>>

>> An alternative way is to change the above code as

>>                          err = 0;

>>                          goto err_put;

> 

> I didn't do it, because it's not really an error case, which err_put

> implies. If in the future we'll have some more clean up to do, it

> might make sense, I suppose.


You can change label err_put to btf_put, so this way, btf_put() will
only show up in one place. But I won't insist on this.

> 

>>

>>> +             }

>>> +     }

>>> +

>>> +     if (env->used_btf_cnt >= MAX_USED_BTFS) {

>>> +             err = -E2BIG;

>>> +             goto err_put;

>>> +     }

>>> +

>>> +     btf_mod = &env->used_btfs[env->used_btf_cnt];

>>> +     btf_mod->btf = btf;

>>> +     btf_mod->module = NULL;

>>> +

>>> +     /* if we reference variables from kernel module, bump its refcount */

>>> +     if (btf_is_module(btf)) {

>>> +             btf_mod->module = btf_try_get_module(btf);

>>> +             if (!btf_mod->module) {

>>> +                     err = -ENXIO;

>>> +                     goto err_put;

>>> +             }

>>> +     }

>>> +

>>> +     env->used_btf_cnt++;

>>> +

>>>        return 0;

>>> +err_put:

>>> +     btf_put(btf);

>>> +     return err;

>>>    }

>>>

>> [...]