[bpf-next,00/15] Support calling kernel function

Message ID	20210316011336.4173585-1-kafai@fb.com
Headers	show Return-Path: <netdev-owner@kernel.org> From: Martin KaFai Lau <kafai@fb.com> To: <bpf@vger.kernel.org> CC: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, <kernel-team@fb.com>, <netdev@vger.kernel.org> Subject: [PATCH bpf-next 00/15] Support calling kernel function Date: Mon, 15 Mar 2021 18:13:36 -0700 Message-ID: <20210316011336.4173585-1-kafai@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain Precedence: bulk
Series	Support calling kernel function \| expand [bpf-next,00/15] Support calling kernel function [bpf-next,02/15] bpf: btf: Support parsing extern func [bpf-next,04/15] bpf: Support bpf program calling kernel function [bpf-next,05/15] bpf: Support kernel function call in x86-32 [bpf-next,08/15] libbpf: Refactor bpf_object__resolve_ksyms_btf_id [bpf-next,09/15] libbpf: Refactor codes for finding btf id of a kernel symbol [bpf-next,12/15] libbpf: Support extern kernel function [bpf-next,14/15] bpf: selftest: bpf_cubic and bpf_dctcp calling kernel functions

Martin KaFai Lau March 16, 2021, 1:13 a.m. UTC

This series adds support to allow bpf program calling kernel function.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation,  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be white listed
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The white listed functions are not bounded to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

Please see individual patch for details.

Martin KaFai Lau (15):
  bpf: Simplify freeing logic in linfo and jited_linfo
  bpf: btf: Support parsing extern func
  bpf: Refactor btf_check_func_arg_match
  bpf: Support bpf program calling kernel function
  bpf: Support kernel function call in x86-32
  tcp: Rename bictcp function prefix to cubictcp
  bpf: tcp: White list some tcp cong functions to be called by
    bpf-tcp-cc
  libbpf: Refactor bpf_object__resolve_ksyms_btf_id
  libbpf: Refactor codes for finding btf id of a kernel symbol
  libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
  libbpf: Record extern sym relocation first
  libbpf: Support extern kernel function
  bpf: selftests: Rename bictcp to bpf_cubic
  bpf: selftest: bpf_cubic and bpf_dctcp calling kernel functions
  bpf: selftest: Add kfunc_call test

 arch/x86/net/bpf_jit_comp.c                   |   5 +
 arch/x86/net/bpf_jit_comp32.c                 | 198 +++++++++
 include/linux/bpf.h                           |  24 ++
 include/linux/btf.h                           |   6 +
 include/linux/filter.h                        |   4 +-
 include/uapi/linux/bpf.h                      |   4 +
 kernel/bpf/btf.c                              | 270 ++++++++-----
 kernel/bpf/core.c                             |  47 +--
 kernel/bpf/disasm.c                           |  32 +-
 kernel/bpf/disasm.h                           |   3 +-
 kernel/bpf/syscall.c                          |   4 +-
 kernel/bpf/verifier.c                         | 380 ++++++++++++++++--
 net/bpf/test_run.c                            |  11 +
 net/core/filter.c                             |  11 +
 net/ipv4/bpf_tcp_ca.c                         |  41 ++
 net/ipv4/tcp_cubic.c                          |  24 +-
 tools/bpf/bpftool/xlated_dumper.c             |   3 +-
 tools/include/uapi/linux/bpf.h                |   4 +
 tools/lib/bpf/btf.c                           |  32 +-
 tools/lib/bpf/btf.h                           |   5 +
 tools/lib/bpf/libbpf.c                        | 316 ++++++++++-----
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |  29 +-
 tools/testing/selftests/bpf/prog_tests/btf.c  | 154 ++++++-
 .../selftests/bpf/prog_tests/kfunc_call.c     |  61 +++
 tools/testing/selftests/bpf/progs/bpf_cubic.c |  36 +-
 tools/testing/selftests/bpf/progs/bpf_dctcp.c |  22 +-
 .../selftests/bpf/progs/kfunc_call_test.c     |  48 +++
 .../bpf/progs/kfunc_call_test_subprog.c       |  31 ++
 28 files changed, 1454 insertions(+), 351 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kfunc_call.c
 create mode 100644 tools/testing/selftests/bpf/progs/kfunc_call_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/kfunc_call_test_subprog.c

Andrii Nakryiko March 18, 2021, 11:32 p.m. UTC | #1

On Tue, Mar 16, 2021 at 12:01 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> This patch refactors the core logic of "btf_check_func_arg_match()"

> into a new function "do_btf_check_func_arg_match()".

> "do_btf_check_func_arg_match()" will be reused later to check

> the kernel function call.

>

> The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation

> which will be useful for a later patch.

>

> Some of the "btf_kind_str[]" usages is replaced with the shortcut

> "btf_type_str(t)".

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---

>  include/linux/btf.h |   5 ++

>  kernel/bpf/btf.c    | 159 ++++++++++++++++++++++++--------------------

>  2 files changed, 91 insertions(+), 73 deletions(-)

>

> diff --git a/include/linux/btf.h b/include/linux/btf.h

> index 7fabf1428093..93bf2e5225f5 100644

> --- a/include/linux/btf.h

> +++ b/include/linux/btf.h

> @@ -140,6 +140,11 @@ static inline bool btf_type_is_enum(const struct btf_type *t)

>         return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;

>  }

>

> +static inline bool btf_type_is_scalar(const struct btf_type *t)

> +{

> +       return btf_type_is_int(t) || btf_type_is_enum(t);

> +}

> +

>  static inline bool btf_type_is_typedef(const struct btf_type *t)

>  {

>         return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;

> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c

> index 96cd24020a38..529b94b601c6 100644

> --- a/kernel/bpf/btf.c

> +++ b/kernel/bpf/btf.c

> @@ -4381,7 +4381,7 @@ static u8 bpf_ctx_convert_map[] = {

>  #undef BPF_LINK_TYPE

>

>  static const struct btf_member *

> -btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,

> +btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,

>                       const struct btf_type *t, enum bpf_prog_type prog_type,

>                       int arg)

>  {

> @@ -5366,122 +5366,135 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr

>         return btf_check_func_type_match(log, btf1, t1, btf2, t2);

>  }

>

> -/* Compare BTF of a function with given bpf_reg_state.

> - * Returns:

> - * EFAULT - there is a verifier bug. Abort verification.

> - * EINVAL - there is a type mismatch or BTF is not available.

> - * 0 - BTF matches with what bpf_reg_state expects.

> - * Only PTR_TO_CTX and SCALAR_VALUE states are recognized.

> - */

> -int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,

> -                            struct bpf_reg_state *regs)

> +static int do_btf_check_func_arg_match(struct bpf_verifier_env *env,


do_btf_check_func_arg_match vs btf_check_func_arg_match distinction is
not clear at all. How about something like

btf_check_func_arg_match vs btf_check_subprog_arg_match (or btf_func
vs bpf_subprog). I think that highlights the main distinction better,
no?

> +                                      const struct btf *btf, u32 func_id,

> +                                      struct bpf_reg_state *regs,

> +                                      bool ptr_to_mem_ok)

>  {

>         struct bpf_verifier_log *log = &env->log;

> -       struct bpf_prog *prog = env->prog;

> -       struct btf *btf = prog->aux->btf;

> -       const struct btf_param *args;

> +       const char *func_name, *ref_tname;

>         const struct btf_type *t, *ref_t;

> -       u32 i, nargs, btf_id, type_size;

> -       const char *tname;

> -       bool is_global;

> -

> -       if (!prog->aux->func_info)

> -               return -EINVAL;

> -

> -       btf_id = prog->aux->func_info[subprog].type_id;

> -       if (!btf_id)

> -               return -EFAULT;

> -

> -       if (prog->aux->func_info_aux[subprog].unreliable)

> -               return -EINVAL;

> +       const struct btf_param *args;

> +       u32 i, nargs;

>

> -       t = btf_type_by_id(btf, btf_id);

> +       t = btf_type_by_id(btf, func_id);

>         if (!t || !btf_type_is_func(t)) {

>                 /* These checks were already done by the verifier while loading

>                  * struct bpf_func_info

>                  */

> -               bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n",

> -                       subprog);

> +               bpf_log(log, "BTF of func_id %u doesn't point to KIND_FUNC\n",

> +                       func_id);

>                 return -EFAULT;

>         }

> -       tname = btf_name_by_offset(btf, t->name_off);

> +       func_name = btf_name_by_offset(btf, t->name_off);

>

>         t = btf_type_by_id(btf, t->type);

>         if (!t || !btf_type_is_func_proto(t)) {

> -               bpf_log(log, "Invalid BTF of func %s\n", tname);

> +               bpf_log(log, "Invalid BTF of func %s\n", func_name);

>                 return -EFAULT;

>         }

>         args = (const struct btf_param *)(t + 1);

>         nargs = btf_type_vlen(t);

>         if (nargs > MAX_BPF_FUNC_REG_ARGS) {

> -               bpf_log(log, "Function %s has %d > %d args\n", tname, nargs,

> +               bpf_log(log, "Function %s has %d > %d args\n", func_name, nargs,

>                         MAX_BPF_FUNC_REG_ARGS);

> -               goto out;

> +               return -EINVAL;

>         }

>

> -       is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;

>         /* check that BTF function arguments match actual types that the

>          * verifier sees.

>          */

>         for (i = 0; i < nargs; i++) {

> -               struct bpf_reg_state *reg = &regs[i + 1];

> +               u32 regno = i + 1;

> +               struct bpf_reg_state *reg = &regs[regno];

>

> -               t = btf_type_by_id(btf, args[i].type);

> -               while (btf_type_is_modifier(t))

> -                       t = btf_type_by_id(btf, t->type);

> -               if (btf_type_is_int(t) || btf_type_is_enum(t)) {

> +               t = btf_type_skip_modifiers(btf, args[i].type, NULL);

> +               if (btf_type_is_scalar(t)) {

>                         if (reg->type == SCALAR_VALUE)

>                                 continue;

> -                       bpf_log(log, "R%d is not a scalar\n", i + 1);

> -                       goto out;

> +                       bpf_log(log, "R%d is not a scalar\n", regno);

> +                       return -EINVAL;

>                 }

> -               if (btf_type_is_ptr(t)) {

> +

> +               if (!btf_type_is_ptr(t)) {

> +                       bpf_log(log, "Unrecognized arg#%d type %s\n",

> +                               i, btf_type_str(t));

> +                       return -EINVAL;

> +               }

> +

> +               ref_t = btf_type_skip_modifiers(btf, t->type, NULL);

> +               ref_tname = btf_name_by_offset(btf, ref_t->name_off);


these two seem to be used only inside else `if (ptr_to_mem_ok)`, let's
move the code and variables inside that branch?

> +               if (btf_get_prog_ctx_type(log, btf, t, env->prog->type, i)) {

>                         /* If function expects ctx type in BTF check that caller

>                          * is passing PTR_TO_CTX.

>                          */

> -                       if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {

> -                               if (reg->type != PTR_TO_CTX) {

> -                                       bpf_log(log,

> -                                               "arg#%d expected pointer to ctx, but got %s\n",

> -                                               i, btf_kind_str[BTF_INFO_KIND(t->info)]);

> -                                       goto out;

> -                               }

> -                               if (check_ctx_reg(env, reg, i + 1))

> -                                       goto out;

> -                               continue;

> +                       if (reg->type != PTR_TO_CTX) {

> +                               bpf_log(log,

> +                                       "arg#%d expected pointer to ctx, but got %s\n",

> +                                       i, btf_type_str(t));

> +                               return -EINVAL;

>                         }

> +                       if (check_ctx_reg(env, reg, regno))

> +                               return -EINVAL;


original code had `continue` here allowing to stop tracking if/else
logic. Any specific reason you removed it? It keeps logic simpler to
follow, imo.

> +               } else if (ptr_to_mem_ok) {


similarly to how you did reduction of nestedness with btf_type_is_ptr, I'd do

if (!ptr_to_mem_ok)
    return -EINVAL;

and let brain forget about another if/else branch tracking

> +                       const struct btf_type *resolve_ret;

> +                       u32 type_size;

>

> -                       if (!is_global)

> -                               goto out;

> -

> -                       t = btf_type_skip_modifiers(btf, t->type, NULL);

> -

> -                       ref_t = btf_resolve_size(btf, t, &type_size);

> -                       if (IS_ERR(ref_t)) {

> +                       resolve_ret = btf_resolve_size(btf, ref_t, &type_size);

> +                       if (IS_ERR(resolve_ret)) {

>                                 bpf_log(log,

> -                                   "arg#%d reference type('%s %s') size cannot be determined: %ld\n",

> -                                   i, btf_type_str(t), btf_name_by_offset(btf, t->name_off),

> -                                       PTR_ERR(ref_t));

> -                               goto out;

> +                                       "arg#%d reference type('%s %s') size cannot be determined: %ld\n",

> +                                       i, btf_type_str(ref_t), ref_tname,

> +                                       PTR_ERR(resolve_ret));

> +                               return -EINVAL;

>                         }

>

> -                       if (check_mem_reg(env, reg, i + 1, type_size))

> -                               goto out;

> -

> -                       continue;

> +                       if (check_mem_reg(env, reg, regno, type_size))

> +                               return -EINVAL;

> +               } else {

> +                       return -EINVAL;

>                 }

> -               bpf_log(log, "Unrecognized arg#%d type %s\n",

> -                       i, btf_kind_str[BTF_INFO_KIND(t->info)]);

> -               goto out;

>         }

> +

>         return 0;

> -out:

> +}

> +

> +/* Compare BTF of a function with given bpf_reg_state.

> + * Returns:

> + * EFAULT - there is a verifier bug. Abort verification.

> + * EINVAL - there is a type mismatch or BTF is not available.

> + * 0 - BTF matches with what bpf_reg_state expects.

> + * Only PTR_TO_CTX and SCALAR_VALUE states are recognized.

> + */

> +int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,

> +                            struct bpf_reg_state *regs)

> +{

> +       struct bpf_prog *prog = env->prog;

> +       struct btf *btf = prog->aux->btf;

> +       bool is_global;

> +       u32 btf_id;

> +       int err;

> +

> +       if (!prog->aux->func_info)

> +               return -EINVAL;

> +

> +       btf_id = prog->aux->func_info[subprog].type_id;

> +       if (!btf_id)

> +               return -EFAULT;

> +

> +       if (prog->aux->func_info_aux[subprog].unreliable)

> +               return -EINVAL;

> +

> +       is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;

> +       err = do_btf_check_func_arg_match(env, btf, btf_id, regs, is_global);

> +

>         /* Compiler optimizations can remove arguments from static functions

>          * or mismatched type can be passed into a global function.

>          * In such cases mark the function as unreliable from BTF point of view.

>          */

> -       prog->aux->func_info_aux[subprog].unreliable = true;

> -       return -EINVAL;

> +       if (err == -EINVAL)

> +               prog->aux->func_info_aux[subprog].unreliable = true;


is there any harm marking it unreliable for any error? this makes it
look like -EINVAL is super-special. If it's EFAULT, it won't matter,
right?

> +       return err;

>  }

>

>  /* Convert BTF of a function into bpf_reg_state if possible

> --

> 2.30.2

>

Andrii Nakryiko March 19, 2021, 1:19 a.m. UTC | #2

On Tue, Mar 16, 2021 at 12:02 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> This patch white list some tcp cong helper functions, tcp_slow_start()

> and tcp_cong_avoid_ai().  They are allowed to be directly called by

> the bpf-tcp-cc program.

>

> A few tcp cc implementation functions are also white listed.

> A potential use case is the bpf-tcp-cc implementation may only

> want to override a subset of a tcp_congestion_ops.  For others,

> the bpf-tcp-cc can directly call the kernel counter parts instead of

> re-implementing (or copy-and-pasting) them to the bpf program.

>

> They will only be available to the bpf-tcp-cc typed program.

> The white listed functions are not bounded to a fixed ABI contract.

> When any of them has changed, the bpf-tcp-cc program has to be changed

> like any in-tree/out-of-tree kernel tcp-cc implementations do also.

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---


Just nits, of course :)

Whitelist is a single word, but see also 49decddd39e5 ("Merge tag
'inclusive-terminology' of
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux"),
allowlist/denylist is recommended for new code.

Acked-by: Andrii Nakryiko <andrii@kernel.org>


>  net/ipv4/bpf_tcp_ca.c | 41 +++++++++++++++++++++++++++++++++++++++++

>  1 file changed, 41 insertions(+)

>

> diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c

> index d520e61649c8..ed6e6b5b762b 100644

> --- a/net/ipv4/bpf_tcp_ca.c

> +++ b/net/ipv4/bpf_tcp_ca.c

> @@ -5,6 +5,7 @@

>  #include <linux/bpf_verifier.h>

>  #include <linux/bpf.h>

>  #include <linux/btf.h>

> +#include <linux/btf_ids.h>

>  #include <linux/filter.h>

>  #include <net/tcp.h>

>  #include <net/bpf_sk_storage.h>

> @@ -178,10 +179,50 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id,

>         }

>  }

>

> +BTF_SET_START(bpf_tcp_ca_kfunc_ids)

> +BTF_ID(func, tcp_reno_ssthresh)

> +BTF_ID(func, tcp_reno_cong_avoid)

> +BTF_ID(func, tcp_reno_undo_cwnd)

> +BTF_ID(func, tcp_slow_start)

> +BTF_ID(func, tcp_cong_avoid_ai)

> +#if IS_BUILTIN(CONFIG_TCP_CONG_CUBIC)

> +BTF_ID(func, cubictcp_init)

> +BTF_ID(func, cubictcp_recalc_ssthresh)

> +BTF_ID(func, cubictcp_cong_avoid)

> +BTF_ID(func, cubictcp_state)

> +BTF_ID(func, cubictcp_cwnd_event)

> +BTF_ID(func, cubictcp_acked)

> +#endif

> +#if IS_BUILTIN(CONFIG_TCP_CONG_DCTCP)

> +BTF_ID(func, dctcp_init)

> +BTF_ID(func, dctcp_update_alpha)

> +BTF_ID(func, dctcp_cwnd_event)

> +BTF_ID(func, dctcp_ssthresh)

> +BTF_ID(func, dctcp_cwnd_undo)

> +BTF_ID(func, dctcp_state)

> +#endif

> +#if IS_BUILTIN(CONFIG_TCP_CONG_BBR)

> +BTF_ID(func, bbr_init)

> +BTF_ID(func, bbr_main)

> +BTF_ID(func, bbr_sndbuf_expand)

> +BTF_ID(func, bbr_undo_cwnd)

> +BTF_ID(func, bbr_cwnd_even),

> +BTF_ID(func, bbr_ssthresh)

> +BTF_ID(func, bbr_min_tso_segs)

> +BTF_ID(func, bbr_set_state)

> +#endif

> +BTF_SET_END(bpf_tcp_ca_kfunc_ids)


see, kfunc here...

> +

> +static bool bpf_tcp_ca_check_kern_func_call(u32 kfunc_btf_id)


...but more verbose kern_func here. I like kfunc everywhere ;)

> +{

> +       return btf_id_set_contains(&bpf_tcp_ca_kfunc_ids, kfunc_btf_id);

> +}

> +

>  static const struct bpf_verifier_ops bpf_tcp_ca_verifier_ops = {

>         .get_func_proto         = bpf_tcp_ca_get_func_proto,

>         .is_valid_access        = bpf_tcp_ca_is_valid_access,

>         .btf_struct_access      = bpf_tcp_ca_btf_struct_access,

> +       .check_kern_func_call   = bpf_tcp_ca_check_kern_func_call,

>  };

>

>  static int bpf_tcp_ca_init_member(const struct btf_type *t,

> --

> 2.30.2

>

Andrii Nakryiko March 19, 2021, 3:15 a.m. UTC | #3

On Tue, Mar 16, 2021 at 12:02 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> This patch renames RELO_EXTERN to RELO_EXTERN_VAR.

> It is to avoid the confusion with a later patch adding

> RELO_EXTERN_FUNC.

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---


Acked-by: Andrii Nakryiko <andrii@kernel.org>


>  tools/lib/bpf/libbpf.c | 6 +++---

>  1 file changed, 3 insertions(+), 3 deletions(-)

>

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c

> index 8355b786b3db..8f924aece736 100644

> --- a/tools/lib/bpf/libbpf.c

> +++ b/tools/lib/bpf/libbpf.c

> @@ -189,7 +189,7 @@ enum reloc_type {

>         RELO_LD64,

>         RELO_CALL,

>         RELO_DATA,

> -       RELO_EXTERN,

> +       RELO_EXTERN_VAR,

>         RELO_SUBPROG_ADDR,

>  };

>

> @@ -3463,7 +3463,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog,

>                 }

>                 pr_debug("prog '%s': found extern #%d '%s' (sym %d) for insn #%u\n",

>                          prog->name, i, ext->name, ext->sym_idx, insn_idx);

> -               reloc_desc->type = RELO_EXTERN;

> +               reloc_desc->type = RELO_EXTERN_VAR;

>                 reloc_desc->insn_idx = insn_idx;

>                 reloc_desc->sym_off = i; /* sym_off stores extern index */

>                 return 0;

> @@ -6226,7 +6226,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)

>                         insn[0].imm = obj->maps[relo->map_idx].fd;

>                         relo->processed = true;

>                         break;

> -               case RELO_EXTERN:

> +               case RELO_EXTERN_VAR:

>                         ext = &obj->externs[relo->sym_off];

>                         if (ext->type == EXT_KCFG) {

>                                 insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;

> --

> 2.30.2

>

Andrii Nakryiko March 19, 2021, 3:16 a.m. UTC | #4

On Tue, Mar 16, 2021 at 12:02 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> This patch records the extern sym relocs first before recording

> subprog relocs.  The later patch will have relocs for extern

> kernel function call which is also using BPF_JMP | BPF_CALL.

> It will be easier to handle the extern symbols first in

> the later patch.

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---


Looks good, just let's add that tiny helper for cleanliness and to
match what we do for ldimm64

Acked-by: Andrii Nakryiko <andrii@kernel.org>


>  tools/lib/bpf/libbpf.c | 50 +++++++++++++++++++++---------------------

>  1 file changed, 25 insertions(+), 25 deletions(-)

>

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c

> index 8f924aece736..0a60fcb2fba2 100644

> --- a/tools/lib/bpf/libbpf.c

> +++ b/tools/lib/bpf/libbpf.c

> @@ -3416,31 +3416,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog,

>

>         reloc_desc->processed = false;

>

> -       /* sub-program call relocation */

> -       if (insn->code == (BPF_JMP | BPF_CALL)) {

> -               if (insn->src_reg != BPF_PSEUDO_CALL) {

> -                       pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name);

> -                       return -LIBBPF_ERRNO__RELOC;

> -               }

> -               /* text_shndx can be 0, if no default "main" program exists */

> -               if (!shdr_idx || shdr_idx != obj->efile.text_shndx) {

> -                       sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));

> -                       pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n",

> -                               prog->name, sym_name, sym_sec_name);

> -                       return -LIBBPF_ERRNO__RELOC;

> -               }

> -               if (sym->st_value % BPF_INSN_SZ) {

> -                       pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n",

> -                               prog->name, sym_name, (size_t)sym->st_value);

> -                       return -LIBBPF_ERRNO__RELOC;

> -               }

> -               reloc_desc->type = RELO_CALL;

> -               reloc_desc->insn_idx = insn_idx;

> -               reloc_desc->sym_off = sym->st_value;

> -               return 0;

> -       }

> -

> -       if (!is_ldimm64(insn)) {

> +       if (insn->code != (BPF_JMP | BPF_CALL) && !is_ldimm64(insn)) {

>                 pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n",

>                         prog->name, sym_name, insn_idx, insn->code);

>                 return -LIBBPF_ERRNO__RELOC;

> @@ -3469,6 +3445,30 @@ static int bpf_program__record_reloc(struct bpf_program *prog,

>                 return 0;

>         }

>

> +       /* sub-program call relocation */

> +       if (insn->code == (BPF_JMP | BPF_CALL)) {


can you please add is_call_insn() helper checking this, similarly to
how we now have is_ldimm64() (should probably be is_ldimm64_insn() for
consistency)

> +               if (insn->src_reg != BPF_PSEUDO_CALL) {

> +                       pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name);

> +                       return -LIBBPF_ERRNO__RELOC;

> +               }

> +               /* text_shndx can be 0, if no default "main" program exists */

> +               if (!shdr_idx || shdr_idx != obj->efile.text_shndx) {

> +                       sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));

> +                       pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n",

> +                               prog->name, sym_name, sym_sec_name);

> +                       return -LIBBPF_ERRNO__RELOC;

> +               }

> +               if (sym->st_value % BPF_INSN_SZ) {

> +                       pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n",

> +                               prog->name, sym_name, (size_t)sym->st_value);

> +                       return -LIBBPF_ERRNO__RELOC;

> +               }

> +               reloc_desc->type = RELO_CALL;

> +               reloc_desc->insn_idx = insn_idx;

> +               reloc_desc->sym_off = sym->st_value;

> +               return 0;

> +       }

> +

>         if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {

>                 pr_warn("prog '%s': invalid relo against '%s' in special section 0x%x; forgot to initialize global var?..\n",

>                         prog->name, sym_name, shdr_idx);

> --

> 2.30.2

>

Andrii Nakryiko March 19, 2021, 4:14 a.m. UTC | #5

On Tue, Mar 16, 2021 at 12:02 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> As a similar chanage in the kernel, this patch gives the proper

> name to the bpf cubic.

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---


Acked-by: Andrii Nakryiko <andrii@kernel.org>


>  tools/testing/selftests/bpf/progs/bpf_cubic.c | 30 +++++++++----------

>  1 file changed, 15 insertions(+), 15 deletions(-)

>


[...]

Andrii Nakryiko March 19, 2021, 4:21 a.m. UTC | #6

On Tue, Mar 16, 2021 at 12:02 AM Martin KaFai Lau <kafai@fb.com> wrote:
>

> This patch adds two kernel function bpf_kfunc_call_test[12]() for the

> selftest's test_run purpose.  They will be allowed for tc_cls prog.

>

> The selftest calling the kernel function bpf_kfunc_call_test[12]()

> is also added in this patch.

>

> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> ---

>  net/bpf/test_run.c                            | 11 ++++

>  net/core/filter.c                             | 11 ++++

>  .../selftests/bpf/prog_tests/kfunc_call.c     | 61 +++++++++++++++++++

>  .../selftests/bpf/progs/kfunc_call_test.c     | 48 +++++++++++++++

>  .../bpf/progs/kfunc_call_test_subprog.c       | 31 ++++++++++

>  5 files changed, 162 insertions(+)

>  create mode 100644 tools/testing/selftests/bpf/prog_tests/kfunc_call.c

>  create mode 100644 tools/testing/selftests/bpf/progs/kfunc_call_test.c

>  create mode 100644 tools/testing/selftests/bpf/progs/kfunc_call_test_subprog.c

>

> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c

> index 0abdd67f44b1..c1baab0c7d96 100644

> --- a/net/bpf/test_run.c

> +++ b/net/bpf/test_run.c

> @@ -209,6 +209,17 @@ int noinline bpf_modify_return_test(int a, int *b)

>         *b += 1;

>         return a + *b;

>  }

> +

> +u64 noinline bpf_kfunc_call_test1(struct sock *sk, u32 a, u64 b, u32 c, u64 d)

> +{

> +       return a + b + c + d;

> +}

> +

> +int noinline bpf_kfunc_call_test2(struct sock *sk, u32 a, u32 b)

> +{

> +       return a + b;

> +}

> +

>  __diag_pop();

>

>  ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);

> diff --git a/net/core/filter.c b/net/core/filter.c

> index 10dac9dd5086..605fbbdd694b 100644

> --- a/net/core/filter.c

> +++ b/net/core/filter.c

> @@ -9799,12 +9799,23 @@ const struct bpf_prog_ops sk_filter_prog_ops = {

>         .test_run               = bpf_prog_test_run_skb,

>  };

>

> +BTF_SET_START(bpf_tc_cls_kfunc_ids)

> +BTF_ID(func, bpf_kfunc_call_test1)

> +BTF_ID(func, bpf_kfunc_call_test2)

> +BTF_SET_END(bpf_tc_cls_kfunc_ids)

> +

> +static bool tc_cls_check_kern_func_call(u32 kfunc_id)

> +{

> +       return btf_id_set_contains(&bpf_tc_cls_kfunc_ids, kfunc_id);

> +}

> +

>  const struct bpf_verifier_ops tc_cls_act_verifier_ops = {

>         .get_func_proto         = tc_cls_act_func_proto,

>         .is_valid_access        = tc_cls_act_is_valid_access,

>         .convert_ctx_access     = tc_cls_act_convert_ctx_access,

>         .gen_prologue           = tc_cls_act_prologue,

>         .gen_ld_abs             = bpf_gen_ld_abs,

> +       .check_kern_func_call   = tc_cls_check_kern_func_call,

>  };

>

>  const struct bpf_prog_ops tc_cls_act_prog_ops = {

> diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c

> new file mode 100644

> index 000000000000..3850e6cc0a7d

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c

> @@ -0,0 +1,61 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +#include <test_progs.h>

> +#include <network_helpers.h>

> +#include "kfunc_call_test.skel.h"

> +#include "kfunc_call_test_subprog.skel.h"

> +

> +static __u32 duration;

> +


you shouldn't need it, you don't use CHECK()s

> +static void test_main(void)

> +{

> +       struct kfunc_call_test *skel;

> +       int prog_fd, retval, err;

> +

> +       skel = kfunc_call_test__open_and_load();

> +       if (!ASSERT_OK_PTR(skel, "skel"))

> +               return;

> +

> +       prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);

> +       err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),

> +                               NULL, NULL, (__u32 *)&retval, &duration);

> +

> +       if (ASSERT_OK(err, "bpf_prog_test_run(test1)"))

> +               ASSERT_EQ(retval, 12, "test1-retval");


there is no harm in doing retval check unconditionally. If something
goes wrong, you'll both know that err != 0 and what retval you got (if
you ever care, but if not, it doesn't hurt either). Same below.

> +

> +       prog_fd = bpf_program__fd(skel->progs.kfunc_call_test2);

> +       err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),

> +                               NULL, NULL, (__u32 *)&retval, &duration);

> +       if (ASSERT_OK(err, "bpf_prog_test_run(test2)"))

> +               ASSERT_EQ(retval, 3, "test2-retval");

> +

> +       kfunc_call_test__destroy(skel);

> +}

> +

> +static void test_subprog(void)

> +{

> +       struct kfunc_call_test_subprog *skel;

> +       int prog_fd, retval, err;

> +

> +       skel = kfunc_call_test_subprog__open_and_load();

> +       if (!ASSERT_OK_PTR(skel, "skel"))

> +               return;

> +

> +       prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);

> +       err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),

> +                               NULL, NULL, (__u32 *)&retval, &duration);

> +

> +       if (ASSERT_OK(err, "bpf_prog_test_run(test1)"))

> +               ASSERT_EQ(retval, 10, "test1-retval");

> +

> +       kfunc_call_test_subprog__destroy(skel);

> +}

> +

> +void test_kfunc_call(void)

> +{

> +       if (test__start_subtest("main"))

> +               test_main();

> +

> +       if (test__start_subtest("subprog"))

> +               test_subprog();

> +}

> diff --git a/tools/testing/selftests/bpf/progs/kfunc_call_test.c b/tools/testing/selftests/bpf/progs/kfunc_call_test.c

> new file mode 100644

> index 000000000000..ea8c5266efd8

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/progs/kfunc_call_test.c

> @@ -0,0 +1,48 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +#include <linux/bpf.h>

> +#include <bpf/bpf_helpers.h>

> +#include "bpf_tcp_helpers.h"

> +

> +extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,

> +                                 __u32 c, __u64 d) __ksym;

> +extern int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;

> +

> +SEC("classifier/test2")

> +int kfunc_call_test2(struct __sk_buff *skb)

> +{

> +       struct bpf_sock *sk = skb->sk;

> +

> +       if (!sk)

> +               return -1;

> +

> +       sk = bpf_sk_fullsock(sk);

> +       if (!sk)

> +               return -1;

> +

> +       return bpf_kfunc_call_test2((struct sock *)sk, 1, 2);

> +}

> +

> +SEC("classifier/test1")


please use just SEC("classifier") here and above, libbpf will handle
that properly

> +int kfunc_call_test1(struct __sk_buff *skb)

> +{

> +       struct bpf_sock *sk = skb->sk;

> +       __u64 a = 1ULL << 32;

> +       __u32 ret;

> +

> +       if (!sk)

> +               return -1;

> +

> +       sk = bpf_sk_fullsock(sk);

> +       if (!sk)

> +               return -1;

> +

> +       a = bpf_kfunc_call_test1((struct sock *)sk, 1, a | 2, 3, a | 4);

> +

> +       ret = a >> 32;   /* ret should be 2 */

> +       ret += (__u32)a; /* ret should be 12 */

> +

> +       return ret;

> +}

> +

> +char _license[] SEC("license") = "GPL";

> diff --git a/tools/testing/selftests/bpf/progs/kfunc_call_test_subprog.c b/tools/testing/selftests/bpf/progs/kfunc_call_test_subprog.c

> new file mode 100644

> index 000000000000..9bf66f8c826e

> --- /dev/null

> +++ b/tools/testing/selftests/bpf/progs/kfunc_call_test_subprog.c

> @@ -0,0 +1,31 @@

> +// SPDX-License-Identifier: GPL-2.0

> +/* Copyright (c) 2021 Facebook */

> +#include <linux/bpf.h>

> +#include <bpf/bpf_helpers.h>

> +#include "bpf_tcp_helpers.h"

> +

> +extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,

> +                                 __u32 c, __u64 d) __ksym;

> +

> +__attribute__ ((noinline))


__noinline

> +int f1(struct __sk_buff *skb)

> +{

> +       struct bpf_sock *sk = skb->sk;

> +

> +       if (!sk)

> +               return -1;

> +

> +       sk = bpf_sk_fullsock(sk);

> +       if (!sk)

> +               return -1;

> +

> +       return (__u32)bpf_kfunc_call_test1((struct sock *)sk, 1, 2, 3, 4);

> +}

> +

> +SEC("classifier/test1_subprog")


same, just "classifier"

> +int kfunc_call_test1(struct __sk_buff *skb)

> +{

> +       return f1(skb);

> +}

> +

> +char _license[] SEC("license") = "GPL";

> --

> 2.30.2

>

Martin KaFai Lau March 19, 2021, 5:40 a.m. UTC | #7

On Thu, Mar 18, 2021 at 09:21:08PM -0700, Andrii Nakryiko wrote:
> > --- /dev/null

> > +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c

> > @@ -0,0 +1,61 @@

> > +// SPDX-License-Identifier: GPL-2.0

> > +/* Copyright (c) 2021 Facebook */

> > +#include <test_progs.h>

> > +#include <network_helpers.h>

> > +#include "kfunc_call_test.skel.h"

> > +#include "kfunc_call_test_subprog.skel.h"

> > +

> > +static __u32 duration;

> > +

> 

> you shouldn't need it, you don't use CHECK()s

It was for bpf_prog_test_run().
Just noticed it can take NULL.  will remove in v2.

Martin KaFai Lau March 19, 2021, 7:32 p.m. UTC | #8

On Thu, Mar 18, 2021 at 04:32:47PM -0700, Andrii Nakryiko wrote:
> On Tue, Mar 16, 2021 at 12:01 AM Martin KaFai Lau <kafai@fb.com> wrote:

> >

> > This patch refactors the core logic of "btf_check_func_arg_match()"

> > into a new function "do_btf_check_func_arg_match()".

> > "do_btf_check_func_arg_match()" will be reused later to check

> > the kernel function call.

> >

> > The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation

> > which will be useful for a later patch.

> >

> > Some of the "btf_kind_str[]" usages is replaced with the shortcut

> > "btf_type_str(t)".

> >

> > Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> > ---

> >  include/linux/btf.h |   5 ++

> >  kernel/bpf/btf.c    | 159 ++++++++++++++++++++++++--------------------

> >  2 files changed, 91 insertions(+), 73 deletions(-)

> >

> > diff --git a/include/linux/btf.h b/include/linux/btf.h

> > index 7fabf1428093..93bf2e5225f5 100644

> > --- a/include/linux/btf.h

> > +++ b/include/linux/btf.h

> > @@ -140,6 +140,11 @@ static inline bool btf_type_is_enum(const struct btf_type *t)

> >         return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;

> >  }

> >

> > +static inline bool btf_type_is_scalar(const struct btf_type *t)

> > +{

> > +       return btf_type_is_int(t) || btf_type_is_enum(t);

> > +}

> > +

> >  static inline bool btf_type_is_typedef(const struct btf_type *t)

> >  {

> >         return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;

> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c

> > index 96cd24020a38..529b94b601c6 100644

> > --- a/kernel/bpf/btf.c

> > +++ b/kernel/bpf/btf.c

> > @@ -4381,7 +4381,7 @@ static u8 bpf_ctx_convert_map[] = {

> >  #undef BPF_LINK_TYPE

> >

> >  static const struct btf_member *

> > -btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,

> > +btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,

> >                       const struct btf_type *t, enum bpf_prog_type prog_type,

> >                       int arg)

> >  {

> > @@ -5366,122 +5366,135 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr

> >         return btf_check_func_type_match(log, btf1, t1, btf2, t2);

> >  }

> >

> > -/* Compare BTF of a function with given bpf_reg_state.

> > - * Returns:

> > - * EFAULT - there is a verifier bug. Abort verification.

> > - * EINVAL - there is a type mismatch or BTF is not available.

> > - * 0 - BTF matches with what bpf_reg_state expects.

> > - * Only PTR_TO_CTX and SCALAR_VALUE states are recognized.

> > - */

> > -int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,

> > -                            struct bpf_reg_state *regs)

> > +static int do_btf_check_func_arg_match(struct bpf_verifier_env *env,

> 

> do_btf_check_func_arg_match vs btf_check_func_arg_match distinction is

> not clear at all. How about something like

> 

> btf_check_func_arg_match vs btf_check_subprog_arg_match (or btf_func

> vs bpf_subprog). I think that highlights the main distinction better,

> no?

will rename.

> 

> > +                                      const struct btf *btf, u32 func_id,

> > +                                      struct bpf_reg_state *regs,

> > +                                      bool ptr_to_mem_ok)

> >  {

> >         struct bpf_verifier_log *log = &env->log;

> > -       struct bpf_prog *prog = env->prog;

> > -       struct btf *btf = prog->aux->btf;

> > -       const struct btf_param *args;

> > +       const char *func_name, *ref_tname;

> >         const struct btf_type *t, *ref_t;

> > -       u32 i, nargs, btf_id, type_size;

> > -       const char *tname;

> > -       bool is_global;

> > -

> > -       if (!prog->aux->func_info)

> > -               return -EINVAL;

> > -

> > -       btf_id = prog->aux->func_info[subprog].type_id;

> > -       if (!btf_id)

> > -               return -EFAULT;

> > -

> > -       if (prog->aux->func_info_aux[subprog].unreliable)

> > -               return -EINVAL;

> > +       const struct btf_param *args;

> > +       u32 i, nargs;

> >

> > -       t = btf_type_by_id(btf, btf_id);

> > +       t = btf_type_by_id(btf, func_id);

> >         if (!t || !btf_type_is_func(t)) {

> >                 /* These checks were already done by the verifier while loading

> >                  * struct bpf_func_info

> >                  */

> > -               bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n",

> > -                       subprog);

> > +               bpf_log(log, "BTF of func_id %u doesn't point to KIND_FUNC\n",

> > +                       func_id);

> >                 return -EFAULT;

> >         }

> > -       tname = btf_name_by_offset(btf, t->name_off);

> > +       func_name = btf_name_by_offset(btf, t->name_off);

> >

> >         t = btf_type_by_id(btf, t->type);

> >         if (!t || !btf_type_is_func_proto(t)) {

> > -               bpf_log(log, "Invalid BTF of func %s\n", tname);

> > +               bpf_log(log, "Invalid BTF of func %s\n", func_name);

> >                 return -EFAULT;

> >         }

> >         args = (const struct btf_param *)(t + 1);

> >         nargs = btf_type_vlen(t);

> >         if (nargs > MAX_BPF_FUNC_REG_ARGS) {

> > -               bpf_log(log, "Function %s has %d > %d args\n", tname, nargs,

> > +               bpf_log(log, "Function %s has %d > %d args\n", func_name, nargs,

> >                         MAX_BPF_FUNC_REG_ARGS);

> > -               goto out;

> > +               return -EINVAL;

> >         }

> >

> > -       is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;

> >         /* check that BTF function arguments match actual types that the

> >          * verifier sees.

> >          */

> >         for (i = 0; i < nargs; i++) {

> > -               struct bpf_reg_state *reg = &regs[i + 1];

> > +               u32 regno = i + 1;

> > +               struct bpf_reg_state *reg = &regs[regno];

> >

> > -               t = btf_type_by_id(btf, args[i].type);

> > -               while (btf_type_is_modifier(t))

> > -                       t = btf_type_by_id(btf, t->type);

> > -               if (btf_type_is_int(t) || btf_type_is_enum(t)) {

> > +               t = btf_type_skip_modifiers(btf, args[i].type, NULL);

> > +               if (btf_type_is_scalar(t)) {

> >                         if (reg->type == SCALAR_VALUE)

> >                                 continue;

> > -                       bpf_log(log, "R%d is not a scalar\n", i + 1);

> > -                       goto out;

> > +                       bpf_log(log, "R%d is not a scalar\n", regno);

> > +                       return -EINVAL;

> >                 }

> > -               if (btf_type_is_ptr(t)) {

> > +

> > +               if (!btf_type_is_ptr(t)) {

> > +                       bpf_log(log, "Unrecognized arg#%d type %s\n",

> > +                               i, btf_type_str(t));

> > +                       return -EINVAL;

> > +               }

> > +

> > +               ref_t = btf_type_skip_modifiers(btf, t->type, NULL);

> > +               ref_tname = btf_name_by_offset(btf, ref_t->name_off);

> 

> these two seem to be used only inside else `if (ptr_to_mem_ok)`, let's

> move the code and variables inside that branch?

It is kept here because the next patch uses it in
another case also.

> 

> > +               if (btf_get_prog_ctx_type(log, btf, t, env->prog->type, i)) {

> >                         /* If function expects ctx type in BTF check that caller

> >                          * is passing PTR_TO_CTX.

> >                          */

> > -                       if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {

> > -                               if (reg->type != PTR_TO_CTX) {

> > -                                       bpf_log(log,

> > -                                               "arg#%d expected pointer to ctx, but got %s\n",

> > -                                               i, btf_kind_str[BTF_INFO_KIND(t->info)]);

> > -                                       goto out;

> > -                               }

> > -                               if (check_ctx_reg(env, reg, i + 1))

> > -                                       goto out;

> > -                               continue;

> > +                       if (reg->type != PTR_TO_CTX) {

> > +                               bpf_log(log,

> > +                                       "arg#%d expected pointer to ctx, but got %s\n",

> > +                                       i, btf_type_str(t));

> > +                               return -EINVAL;

> >                         }

> > +                       if (check_ctx_reg(env, reg, regno))

> > +                               return -EINVAL;

> 

> original code had `continue` here allowing to stop tracking if/else

> logic. Any specific reason you removed it? It keeps logic simpler to

> follow, imo.

There is no other case after this.
"continue" becomes redundant, so removed.

> 

> > +               } else if (ptr_to_mem_ok) {

> 

> similarly to how you did reduction of nestedness with btf_type_is_ptr, I'd do

> 

> if (!ptr_to_mem_ok)

>     return -EINVAL;

> 

> and let brain forget about another if/else branch tracking

I don't see a significant difference.  Either way looks the same with
a few more test cases, IMO.

I prefer to keep it like this since there is
another test case added in the next patch.

There are usages with much longer if-else-if statement inside a
loop in the verifier also without explicit "continue" in the middle
or handle the last case differently and they are very readable.

> 

> > +                       const struct btf_type *resolve_ret;

> > +                       u32 type_size;

> >

> > -                       if (!is_global)

> > -                               goto out;

> > -

> > -                       t = btf_type_skip_modifiers(btf, t->type, NULL);

> > -

> > -                       ref_t = btf_resolve_size(btf, t, &type_size);

> > -                       if (IS_ERR(ref_t)) {

> > +                       resolve_ret = btf_resolve_size(btf, ref_t, &type_size);

> > +                       if (IS_ERR(resolve_ret)) {

> >                                 bpf_log(log,

> > -                                   "arg#%d reference type('%s %s') size cannot be determined: %ld\n",

> > -                                   i, btf_type_str(t), btf_name_by_offset(btf, t->name_off),

> > -                                       PTR_ERR(ref_t));

> > -                               goto out;

> > +                                       "arg#%d reference type('%s %s') size cannot be determined: %ld\n",

> > +                                       i, btf_type_str(ref_t), ref_tname,

> > +                                       PTR_ERR(resolve_ret));

> > +                               return -EINVAL;

> >                         }

> >

> > -                       if (check_mem_reg(env, reg, i + 1, type_size))

> > -                               goto out;

> > -

> > -                       continue;

> > +                       if (check_mem_reg(env, reg, regno, type_size))

> > +                               return -EINVAL;

> > +               } else {

> > +                       return -EINVAL;

> >                 }

> > -               bpf_log(log, "Unrecognized arg#%d type %s\n",

> > -                       i, btf_kind_str[BTF_INFO_KIND(t->info)]);

> > -               goto out;

> >         }

> > +

> >         return 0;

> > -out:

> > +}

> > +

> > +/* Compare BTF of a function with given bpf_reg_state.

> > + * Returns:

> > + * EFAULT - there is a verifier bug. Abort verification.

> > + * EINVAL - there is a type mismatch or BTF is not available.

> > + * 0 - BTF matches with what bpf_reg_state expects.

> > + * Only PTR_TO_CTX and SCALAR_VALUE states are recognized.

> > + */

> > +int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,

> > +                            struct bpf_reg_state *regs)

> > +{

> > +       struct bpf_prog *prog = env->prog;

> > +       struct btf *btf = prog->aux->btf;

> > +       bool is_global;

> > +       u32 btf_id;

> > +       int err;

> > +

> > +       if (!prog->aux->func_info)

> > +               return -EINVAL;

> > +

> > +       btf_id = prog->aux->func_info[subprog].type_id;

> > +       if (!btf_id)

> > +               return -EFAULT;

> > +

> > +       if (prog->aux->func_info_aux[subprog].unreliable)

> > +               return -EINVAL;

> > +

> > +       is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;

> > +       err = do_btf_check_func_arg_match(env, btf, btf_id, regs, is_global);

> > +

> >         /* Compiler optimizations can remove arguments from static functions

> >          * or mismatched type can be passed into a global function.

> >          * In such cases mark the function as unreliable from BTF point of view.

> >          */

> > -       prog->aux->func_info_aux[subprog].unreliable = true;

> > -       return -EINVAL;

> > +       if (err == -EINVAL)

> > +               prog->aux->func_info_aux[subprog].unreliable = true;

> 

> is there any harm marking it unreliable for any error? this makes it

> look like -EINVAL is super-special. If it's EFAULT, it won't matter,

> right?

will always assign true on any err.

> 

> > +       return err;

> >  }

> >

> >  /* Convert BTF of a function into bpf_reg_state if possible

> > --

> > 2.30.2

> >

Andrii Nakryiko March 19, 2021, 9:51 p.m. UTC | #9

On Fri, Mar 19, 2021 at 12:32 PM Martin KaFai Lau <kafai@fb.com> wrote:
>

> On Thu, Mar 18, 2021 at 04:32:47PM -0700, Andrii Nakryiko wrote:

> > On Tue, Mar 16, 2021 at 12:01 AM Martin KaFai Lau <kafai@fb.com> wrote:

> > >

> > > This patch refactors the core logic of "btf_check_func_arg_match()"

> > > into a new function "do_btf_check_func_arg_match()".

> > > "do_btf_check_func_arg_match()" will be reused later to check

> > > the kernel function call.

> > >

> > > The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation

> > > which will be useful for a later patch.

> > >

> > > Some of the "btf_kind_str[]" usages is replaced with the shortcut

> > > "btf_type_str(t)".

> > >

> > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>

> > > ---

> > >  include/linux/btf.h |   5 ++

> > >  kernel/bpf/btf.c    | 159 ++++++++++++++++++++++++--------------------

> > >  2 files changed, 91 insertions(+), 73 deletions(-)

> > >


[...]

> > > +               if (!btf_type_is_ptr(t)) {

> > > +                       bpf_log(log, "Unrecognized arg#%d type %s\n",

> > > +                               i, btf_type_str(t));

> > > +                       return -EINVAL;

> > > +               }

> > > +

> > > +               ref_t = btf_type_skip_modifiers(btf, t->type, NULL);

> > > +               ref_tname = btf_name_by_offset(btf, ref_t->name_off);

> >

> > these two seem to be used only inside else `if (ptr_to_mem_ok)`, let's

> > move the code and variables inside that branch?

> It is kept here because the next patch uses it in

> another case also.


yeah, I saw that once I got to that patch, never mind

>

> >

> > > +               if (btf_get_prog_ctx_type(log, btf, t, env->prog->type, i)) {

> > >                         /* If function expects ctx type in BTF check that caller

> > >                          * is passing PTR_TO_CTX.

> > >                          */

> > > -                       if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {

> > > -                               if (reg->type != PTR_TO_CTX) {

> > > -                                       bpf_log(log,

> > > -                                               "arg#%d expected pointer to ctx, but got %s\n",

> > > -                                               i, btf_kind_str[BTF_INFO_KIND(t->info)]);

> > > -                                       goto out;

> > > -                               }

> > > -                               if (check_ctx_reg(env, reg, i + 1))

> > > -                                       goto out;

> > > -                               continue;

> > > +                       if (reg->type != PTR_TO_CTX) {

> > > +                               bpf_log(log,

> > > +                                       "arg#%d expected pointer to ctx, but got %s\n",

> > > +                                       i, btf_type_str(t));

> > > +                               return -EINVAL;

> > >                         }

> > > +                       if (check_ctx_reg(env, reg, regno))

> > > +                               return -EINVAL;

> >

> > original code had `continue` here allowing to stop tracking if/else

> > logic. Any specific reason you removed it? It keeps logic simpler to

> > follow, imo.

> There is no other case after this.

> "continue" becomes redundant, so removed.


well, there is the entire "else if (ptr_to_mem_ok)" which now you need
to skip and go check if there is anything else that is supposed to
happen after if. `continue;`, on the other hand, makes it very clear
that nothing more is going to happen

>

> >

> > > +               } else if (ptr_to_mem_ok) {

> >

> > similarly to how you did reduction of nestedness with btf_type_is_ptr, I'd do

> >

> > if (!ptr_to_mem_ok)

> >     return -EINVAL;

> >

> > and let brain forget about another if/else branch tracking

> I don't see a significant difference.  Either way looks the same with

> a few more test cases, IMO.

>

> I prefer to keep it like this since there is

> another test case added in the next patch.

>

> There are usages with much longer if-else-if statement inside a

> loop in the verifier also without explicit "continue" in the middle

> or handle the last case differently and they are very readable.


It's a matter of taste, I suppose. I'd probably disagree with you on
the readability of those verifier parts ;) So it's up to you, of
course, but for me this code pattern:

for (...) {
    if (A) {
        handleA;
    } else if (B) {
        handleB;
    } else {
        return -EINVAL;
    }
}

is much harder to follow than more linear (imo)

for (...) {
    if (A) {
        handleA;
        continue;
    }

    if (!B)
        return -EINVAL;

    handleB;
}

especially if handleA and handleB are quite long and complicated.
Because I have to jump back and forth to validate that C is not
allowed/handled later, and that there is no common subsequent logic
for both A and B (or even C). In the latter code pattern there are
clear "only A" and "only B" logic and it's quite obvious that no C is
allowed/handled.

>

> >

> > > +                       const struct btf_type *resolve_ret;

> > > +                       u32 type_size;

> > >

> > > -                       if (!is_global)

> > > -                               goto out;

> > > -


[...]

Alexei Starovoitov March 20, 2021, 12:10 a.m. UTC | #10

On 3/19/21 2:51 PM, Andrii Nakryiko wrote:
> 

> It's a matter of taste, I suppose. I'd probably disagree with you on

> the readability of those verifier parts ;) So it's up to you, of

> course, but for me this code pattern:

> 

> for (...) {

>      if (A) {

>          handleA;

>      } else if (B) {

>          handleB;

>      } else {

>          return -EINVAL;

>      }

> }

> 

> is much harder to follow than more linear (imo)

> 

> for (...) {

>      if (A) {

>          handleA;

>          continue;

>      }

> 

>      if (!B)

>          return -EINVAL;

> 

>      handleB;

> }

> 

> especially if handleA and handleB are quite long and complicated.

> Because I have to jump back and forth to validate that C is not

> allowed/handled later, and that there is no common subsequent logic

> for both A and B (or even C). In the latter code pattern there are

> clear "only A" and "only B" logic and it's quite obvious that no C is

> allowed/handled.


my .02. I like the former (Martin's case) much better than the later.
We had few patterns like the later in the past and had to turn them
into the former because "case C" appeared.
In other words:
if (A)
else if (B)
else
   return

is much easier to extend for C and later convert to 'switch' with 'D':
less code churn, easier to refactor.

Andrii Nakryiko March 20, 2021, 5:13 p.m. UTC | #11

On Fri, Mar 19, 2021 at 5:10 PM Alexei Starovoitov <ast@fb.com> wrote:
>

> On 3/19/21 2:51 PM, Andrii Nakryiko wrote:

> >

> > It's a matter of taste, I suppose. I'd probably disagree with you on

> > the readability of those verifier parts ;) So it's up to you, of

> > course, but for me this code pattern:

> >

> > for (...) {

> >      if (A) {

> >          handleA;

> >      } else if (B) {

> >          handleB;

> >      } else {

> >          return -EINVAL;

> >      }

> > }

> >

> > is much harder to follow than more linear (imo)

> >

> > for (...) {

> >      if (A) {

> >          handleA;

> >          continue;

> >      }

> >

> >      if (!B)

> >          return -EINVAL;

> >

> >      handleB;

> > }

> >

> > especially if handleA and handleB are quite long and complicated.

> > Because I have to jump back and forth to validate that C is not

> > allowed/handled later, and that there is no common subsequent logic

> > for both A and B (or even C). In the latter code pattern there are

> > clear "only A" and "only B" logic and it's quite obvious that no C is

> > allowed/handled.

>

> my .02. I like the former (Martin's case) much better than the later.

> We had few patterns like the later in the past and had to turn them

> into the former because "case C" appeared.

> In other words:

> if (A)

> else if (B)

> else

>    return

>

> is much easier to extend for C and later convert to 'switch' with 'D':

> less code churn, easier to refactor.

I think code structure should reflect current logic, not be in
preparation for further potential extension, which might not even
happen. If there are only A and B possible, then code should make it
as clear as possible. But if we anticipate another case C, then

if (A) {
    handleA;
    continue;
}
if (B) {
    handle B;
    continue;
}
return -EINVAL;

Is still easier to follow and is easy to extend.

My original point was that `if () {} else if () {}` code structure
implies that there is or might be some common handling logic after
if/else, so at least my brain constantly worries about that and jumps
around in the code to validate that there isn't actually anything
else. And that gets progressively more harder with longer or more
complicated logic inside handleA and handleB.

Anyways, I'm not trying to enforce my personal style, I tried to show
that it's objectively superior from my brain's point of view. That
`continue` is "a pruning point", if you will. But I'm not trying to
convert anyone. Please proceed with whatever code structure you feel
is better.

[bpf-next,00/15] Support calling kernel function

Message

Comments