From patchwork Mon Apr 27 20:12:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 781D1C54FCB for ; Mon, 27 Apr 2020 20:14:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D4DA2072D for ; Mon, 27 Apr 2020 20:14:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="hbdzrCyq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726952AbgD0UN3 (ORCPT ); Mon, 27 Apr 2020 16:13:29 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:30478 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726849AbgD0UN2 (ORCPT ); Mon, 27 Apr 2020 16:13:28 -0400 Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RKBcKL003423 for ; Mon, 27 Apr 2020 13:13:28 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=Zt20xQdlfJLBte//mGUFbKnydXAnAYSLNsjyfy0EltE=; b=hbdzrCyq5Qao/aA+WwEGqQyNgzolLi7+K2WmvJyK+ZXl7+rg1XbyZSaTm3IQoIsySk70 af9ME65aYII07goiqlFTU9qdBazjRbGTRmfQaLsxE8DQsND/fr673836w6V76bg90psr +HqvPJmzZGA7aLd2/rnmX580Q80FMJ2gcW8= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30n5bx246a-16 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:13:27 -0700 Received: from intmgw002.08.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:41 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id E79243700871; Mon, 27 Apr 2020 13:12:37 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 03/19] bpf: add bpf_map iterator Date: Mon, 27 Apr 2020 13:12:37 -0700 Message-ID: <20200427201237.2994794-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 mlxscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 clxscore=1015 impostorscore=0 adultscore=0 priorityscore=1501 spamscore=0 phishscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270164 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The bpf_map iterator is implemented. The bpf program is called at seq_ops show() and stop() functions. bpf_iter_get_prog() will retrieve bpf program and other parameters during seq_file object traversal. In show() function, bpf program will traverse every valid object, and in stop() function, bpf program will be called one more time after all objects are traversed. The first member of the bpf context contains the meta data, namely, the seq_file, session_id and seq_num. Here, the session_id is a unique id for one specific seq_file session. The seq_num is the number of bpf prog invocations in the current session. The bpf_iter_get_prog(), which will be implemented in subsequent patches, will have more information on how meta data are computed. The second member of the bpf context is a struct bpf_map pointer, which bpf program can examine. The target implementation also provided the structure definition for bpf program and the function definition for verifier to verify the bpf program. Specifically for bpf_map iterator, the structure is "bpf_iter__bpf_map" andd the function is "__bpf_iter__bpf_map". More targets will be implemented later, all of which will include the following, similar to bpf_map iterator: - seq_ops() implementation - function definition for verifier to verify the bpf program - seq_file private data size - additional target feature Signed-off-by: Yonghong Song --- include/linux/bpf.h | 10 ++++ kernel/bpf/Makefile | 2 +- kernel/bpf/bpf_iter.c | 19 ++++++++ kernel/bpf/map_iter.c | 107 ++++++++++++++++++++++++++++++++++++++++++ kernel/bpf/syscall.c | 13 +++++ 5 files changed, 150 insertions(+), 1 deletion(-) create mode 100644 kernel/bpf/map_iter.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 5e56abc1e2f1..4ac8d61f7c3e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1078,6 +1078,7 @@ int generic_map_update_batch(struct bpf_map *map, int generic_map_delete_batch(struct bpf_map *map, const union bpf_attr *attr, union bpf_attr __user *uattr); +struct bpf_map *bpf_map_get_curr_or_next(u32 *id); extern int sysctl_unprivileged_bpf_disabled; @@ -1118,7 +1119,16 @@ struct bpf_iter_reg { u32 target_feature; }; +struct bpf_iter_meta { + __bpf_md_ptr(struct seq_file *, seq); + u64 session_id; + u64 seq_num; +}; + int bpf_iter_reg_target(struct bpf_iter_reg *reg_info); +struct bpf_prog *bpf_iter_get_prog(struct seq_file *seq, u32 priv_data_size, + u64 *session_id, u64 *seq_num, bool is_last); +int bpf_iter_run_prog(struct bpf_prog *prog, void *ctx); int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value); int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value); diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 6a8b0febd3f6..b2b5eefc5254 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -2,7 +2,7 @@ obj-y := core.o CFLAGS_core.o += $(call cc-disable-warning, override-init) -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 1115b978607a..284c95587803 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -48,3 +48,22 @@ int bpf_iter_reg_target(struct bpf_iter_reg *reg_info) return 0; } + +struct bpf_prog *bpf_iter_get_prog(struct seq_file *seq, u32 priv_data_size, + u64 *session_id, u64 *seq_num, bool is_last) +{ + return NULL; +} + +int bpf_iter_run_prog(struct bpf_prog *prog, void *ctx) +{ + int ret; + + migrate_disable(); + rcu_read_lock(); + ret = BPF_PROG_RUN(prog, ctx); + rcu_read_unlock(); + migrate_enable(); + + return ret; +} diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c new file mode 100644 index 000000000000..bb3ad4c3bde5 --- /dev/null +++ b/kernel/bpf/map_iter.c @@ -0,0 +1,107 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2020 Facebook */ +#include +#include +#include +#include + +struct bpf_iter_seq_map_info { + struct bpf_map *map; + u32 id; +}; + +static void *bpf_map_seq_start(struct seq_file *seq, loff_t *pos) +{ + struct bpf_iter_seq_map_info *info = seq->private; + struct bpf_map *map; + u32 id = info->id; + + map = bpf_map_get_curr_or_next(&id); + if (IS_ERR_OR_NULL(map)) + return NULL; + + ++*pos; + info->map = map; + info->id = id; + return map; +} + +static void *bpf_map_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + struct bpf_iter_seq_map_info *info = seq->private; + struct bpf_map *map; + + ++*pos; + ++info->id; + map = bpf_map_get_curr_or_next(&info->id); + if (IS_ERR_OR_NULL(map)) + return NULL; + + bpf_map_put(info->map); + info->map = map; + return map; +} + +struct bpf_iter__bpf_map { + __bpf_md_ptr(struct bpf_iter_meta *, meta); + __bpf_md_ptr(struct bpf_map *, map); +}; + +int __init __bpf_iter__bpf_map(struct bpf_iter_meta *meta, struct bpf_map *map) +{ + return 0; +} + +static int bpf_map_seq_show(struct seq_file *seq, void *v) +{ + struct bpf_iter_meta meta; + struct bpf_iter__bpf_map ctx; + struct bpf_prog *prog; + int ret = 0; + + ctx.meta = &meta; + ctx.map = v; + meta.seq = seq; + prog = bpf_iter_get_prog(seq, sizeof(struct bpf_iter_seq_map_info), + &meta.session_id, &meta.seq_num, + v == (void *)0); + if (prog) + ret = bpf_iter_run_prog(prog, &ctx); + + return ret == 0 ? 0 : -EINVAL; +} + +static void bpf_map_seq_stop(struct seq_file *seq, void *v) +{ + struct bpf_iter_seq_map_info *info = seq->private; + + if (!v) + bpf_map_seq_show(seq, v); + + if (info->map) { + bpf_map_put(info->map); + info->map = NULL; + } +} + +static const struct seq_operations bpf_map_seq_ops = { + .start = bpf_map_seq_start, + .next = bpf_map_seq_next, + .stop = bpf_map_seq_stop, + .show = bpf_map_seq_show, +}; + +static int __init bpf_map_iter_init(void) +{ + struct bpf_iter_reg reg_info = { + .target = "bpf_map", + .target_func_name = "__bpf_iter__bpf_map", + .seq_ops = &bpf_map_seq_ops, + .seq_priv_size = sizeof(struct bpf_iter_seq_map_info), + .target_feature = 0, + }; + + return bpf_iter_reg_target(®_info); +} + +late_initcall(bpf_map_iter_init); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7626b8024471..022187640943 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2800,6 +2800,19 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr, return err; } +struct bpf_map *bpf_map_get_curr_or_next(u32 *id) +{ + struct bpf_map *map; + + spin_lock_bh(&map_idr_lock); + map = idr_get_next(&map_idr, id); + if (map) + map = __bpf_map_inc_not_zero(map, false); + spin_unlock_bh(&map_idr_lock); + + return map; +} + #define BPF_PROG_GET_FD_BY_ID_LAST_FIELD prog_id struct bpf_prog *bpf_prog_by_id(u32 id) From patchwork Mon Apr 27 20:12:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D199C81B42 for ; Mon, 27 Apr 2020 20:12:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E47EC2070B for ; Mon, 27 Apr 2020 20:12:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="RE2RnCXT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726791AbgD0UMq (ORCPT ); Mon, 27 Apr 2020 16:12:46 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:26852 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726661AbgD0UMo (ORCPT ); Mon, 27 Apr 2020 16:12:44 -0400 Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RKAomR031134 for ; Mon, 27 Apr 2020 13:12:44 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=FlGBMjFQ58EfoYMjgVzYQGqT38VwRmb0fAAp4dHeBEM=; b=RE2RnCXTYm2NqYToXxK0glrj9/9A/Nlkayold1Zn5igT/QhRWAZmvD99Vi8nD9t/Y/Ow +q4mmJJk3DYV0Ma9xsl2Dt3Wqju0teYKfP4jg7lAtUXW2F5IHRI+I/Fv6ejbVdiigbl5 ZzvtZl4kfhrVh/0y1AHTNlPVNLYyRufo4OE= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com with ESMTP id 30n57q245w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:44 -0700 Received: from intmgw004.08.frc2.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:11d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:43 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 2BF2F3700871; Mon, 27 Apr 2020 13:12:39 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 04/19] bpf: allow loading of a bpf_iter program Date: Mon, 27 Apr 2020 13:12:39 -0700 Message-ID: <20200427201239.2994896-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 suspectscore=0 lowpriorityscore=0 mlxscore=0 spamscore=0 phishscore=0 adultscore=0 clxscore=1015 impostorscore=0 malwarescore=0 mlxlogscore=999 priorityscore=1501 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270164 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org A bpf_iter program is a tracing program with attach type BPF_TRACE_ITER. The load attribute attach_btf_id is used by the verifier against a particular kernel function, e.g., __bpf_iter__bpf_map in our previous bpf_map iterator. The program return value must be 0 for now. In the future, other return values may be used for filtering or teminating the iterator. Signed-off-by: Yonghong Song --- include/uapi/linux/bpf.h | 1 + kernel/bpf/verifier.c | 20 ++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 1 + 3 files changed, 22 insertions(+) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 4a6c47f3febe..f39b9fec37ab 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -215,6 +215,7 @@ enum bpf_attach_type { BPF_TRACE_FEXIT, BPF_MODIFY_RETURN, BPF_LSM_MAC, + BPF_TRACE_ITER, __MAX_BPF_ATTACH_TYPE }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 91728e0f27eb..fd36c22685d9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -7074,6 +7074,11 @@ static int check_return_code(struct bpf_verifier_env *env) return 0; range = tnum_const(0); break; + case BPF_PROG_TYPE_TRACING: + if (env->prog->expected_attach_type != BPF_TRACE_ITER) + return 0; + range = tnum_const(0); + break; default: return 0; } @@ -10454,6 +10459,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) struct bpf_prog *tgt_prog = prog->aux->linked_prog; u32 btf_id = prog->aux->attach_btf_id; const char prefix[] = "btf_trace_"; + struct btf_func_model fmodel; int ret = 0, subprog = -1, i; struct bpf_trampoline *tr; const struct btf_type *t; @@ -10595,6 +10601,20 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) prog->aux->attach_func_proto = t; prog->aux->attach_btf_trace = true; return 0; + case BPF_TRACE_ITER: + if (!btf_type_is_func(t)) { + verbose(env, "attach_btf_id %u is not a function\n", + btf_id); + return -EINVAL; + } + t = btf_type_by_id(btf, t->type); + if (!btf_type_is_func_proto(t)) + return -EINVAL; + prog->aux->attach_func_name = tname; + prog->aux->attach_func_proto = t; + ret = btf_distill_func_proto(&env->log, btf, t, + tname, &fmodel); + return ret; default: if (!prog_extension) return -EINVAL; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 4a6c47f3febe..f39b9fec37ab 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -215,6 +215,7 @@ enum bpf_attach_type { BPF_TRACE_FEXIT, BPF_MODIFY_RETURN, BPF_LSM_MAC, + BPF_TRACE_ITER, __MAX_BPF_ATTACH_TYPE }; From patchwork Mon Apr 27 20:12:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220470 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4ABFEC4CECD for ; Mon, 27 Apr 2020 20:12:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E1AA2070B for ; Mon, 27 Apr 2020 20:12:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="jOec8IbX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726878AbgD0UMz (ORCPT ); Mon, 27 Apr 2020 16:12:55 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:48472 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726818AbgD0UMt (ORCPT ); Mon, 27 Apr 2020 16:12:49 -0400 Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK4igA019172 for ; Mon, 27 Apr 2020 13:12:48 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=xeYLGxkD1hUz5mfzBgArcwzev+JPL6yY1fUDybsXNoo=; b=jOec8IbXN3u/tmHLvA89hKY/Q0GjYZnPkf75KlOtixzTGui79IkLl3avfm4olg3ZASCY zUpkGfBWbI9NBqEFcCkwYE9AIyM7sU3P4/Nlj2fyv2WL8DC5w/LSFJ/ply3LM59oUgxS BS7GCdDGbn/ghwldqzVfDJlGfqXRJRhAj7g= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30nq53mw3w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:47 -0700 Received: from intmgw003.08.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:46 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 1FBE83700871; Mon, 27 Apr 2020 13:12:44 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 08/19] bpf: create file bpf iterator Date: Mon, 27 Apr 2020 13:12:44 -0700 Message-ID: <20200427201244.2995241-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 mlxlogscore=745 spamscore=0 lowpriorityscore=0 suspectscore=0 impostorscore=0 priorityscore=1501 phishscore=0 bulkscore=0 malwarescore=0 adultscore=0 mlxscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270163 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org A new obj type BPF_TYPE_ITER is added to bpffs. To produce a file bpf iterator, the fd must be corresponding to a link_fd assocciated with a trace/iter program. When the pinned file is opened, a seq_file will be generated. Signed-off-by: Yonghong Song --- include/linux/bpf.h | 3 +++ kernel/bpf/bpf_iter.c | 48 ++++++++++++++++++++++++++++++++++++++++++- kernel/bpf/inode.c | 28 +++++++++++++++++++++++++ kernel/bpf/syscall.c | 2 +- 4 files changed, 79 insertions(+), 2 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0f0cafc65a04..601b3299b7e4 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1021,6 +1021,8 @@ static inline void bpf_enable_instrumentation(void) extern const struct file_operations bpf_map_fops; extern const struct file_operations bpf_prog_fops; +extern const struct file_operations bpf_link_fops; +extern const struct file_operations bpffs_iter_fops; #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \ extern const struct bpf_prog_ops _name ## _prog_ops; \ @@ -1136,6 +1138,7 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); int bpf_iter_link_replace(struct bpf_link *link, struct bpf_prog *old_prog, struct bpf_prog *new_prog); int bpf_iter_new_fd(struct bpf_link *link); +void *bpf_iter_get_from_fd(u32 ufd); int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value); int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value); diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 1f4e778d1814..f5e933236996 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -123,7 +123,8 @@ struct bpf_prog *bpf_iter_get_prog(struct seq_file *seq, u32 priv_data_size, { struct extra_priv_data *extra_data; - if (seq->file->f_op != &anon_bpf_iter_fops) + if (seq->file->f_op != &anon_bpf_iter_fops && + seq->file->f_op != &bpffs_iter_fops) return NULL; extra_data = get_extra_priv_dptr(seq->private, priv_data_size); @@ -310,3 +311,48 @@ int bpf_iter_new_fd(struct bpf_link *link) put_unused_fd(fd); return err; } + +static int bpffs_iter_open(struct inode *inode, struct file *file) +{ + struct bpf_iter_link *link = inode->i_private; + + return prepare_seq_file(file, link); +} + +static int bpffs_iter_release(struct inode *inode, struct file *file) +{ + return anon_iter_release(inode, file); +} + +const struct file_operations bpffs_iter_fops = { + .open = bpffs_iter_open, + .read = seq_read, + .release = bpffs_iter_release, +}; + +void *bpf_iter_get_from_fd(u32 ufd) +{ + struct bpf_link *link; + struct bpf_prog *prog; + struct fd f; + + f = fdget(ufd); + if (!f.file) + return ERR_PTR(-EBADF); + if (f.file->f_op != &bpf_link_fops) { + link = ERR_PTR(-EINVAL); + goto out; + } + + link = f.file->private_data; + prog = link->prog; + if (prog->expected_attach_type != BPF_TRACE_ITER) { + link = ERR_PTR(-EINVAL); + goto out; + } + + bpf_link_inc(link); +out: + fdput(f); + return link; +} diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 95087d9f4ed3..de4493983a37 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -26,6 +26,7 @@ enum bpf_type { BPF_TYPE_PROG, BPF_TYPE_MAP, BPF_TYPE_LINK, + BPF_TYPE_ITER, }; static void *bpf_any_get(void *raw, enum bpf_type type) @@ -38,6 +39,7 @@ static void *bpf_any_get(void *raw, enum bpf_type type) bpf_map_inc_with_uref(raw); break; case BPF_TYPE_LINK: + case BPF_TYPE_ITER: bpf_link_inc(raw); break; default: @@ -58,6 +60,7 @@ static void bpf_any_put(void *raw, enum bpf_type type) bpf_map_put_with_uref(raw); break; case BPF_TYPE_LINK: + case BPF_TYPE_ITER: bpf_link_put(raw); break; default: @@ -82,6 +85,15 @@ static void *bpf_fd_probe_obj(u32 ufd, enum bpf_type *type) return raw; } + /* check bpf_iter before bpf_link as + * ufd is also a link. + */ + raw = bpf_iter_get_from_fd(ufd); + if (!IS_ERR(raw)) { + *type = BPF_TYPE_ITER; + return raw; + } + raw = bpf_link_get_from_fd(ufd); if (!IS_ERR(raw)) { *type = BPF_TYPE_LINK; @@ -96,6 +108,7 @@ static const struct inode_operations bpf_dir_iops; static const struct inode_operations bpf_prog_iops = { }; static const struct inode_operations bpf_map_iops = { }; static const struct inode_operations bpf_link_iops = { }; +static const struct inode_operations bpf_iter_iops = { }; static struct inode *bpf_get_inode(struct super_block *sb, const struct inode *dir, @@ -135,6 +148,8 @@ static int bpf_inode_type(const struct inode *inode, enum bpf_type *type) *type = BPF_TYPE_MAP; else if (inode->i_op == &bpf_link_iops) *type = BPF_TYPE_LINK; + else if (inode->i_op == &bpf_iter_iops) + *type = BPF_TYPE_ITER; else return -EACCES; @@ -362,6 +377,12 @@ static int bpf_mklink(struct dentry *dentry, umode_t mode, void *arg) &bpffs_obj_fops); } +static int bpf_mkiter(struct dentry *dentry, umode_t mode, void *arg) +{ + return bpf_mkobj_ops(dentry, mode, arg, &bpf_iter_iops, + &bpffs_iter_fops); +} + static struct dentry * bpf_lookup(struct inode *dir, struct dentry *dentry, unsigned flags) { @@ -441,6 +462,9 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw, case BPF_TYPE_LINK: ret = vfs_mkobj(dentry, mode, bpf_mklink, raw); break; + case BPF_TYPE_ITER: + ret = vfs_mkobj(dentry, mode, bpf_mkiter, raw); + break; default: ret = -EPERM; } @@ -519,6 +543,8 @@ int bpf_obj_get_user(const char __user *pathname, int flags) ret = bpf_map_new_fd(raw, f_flags); else if (type == BPF_TYPE_LINK) ret = bpf_link_new_fd(raw); + else if (type == BPF_TYPE_ITER) + ret = bpf_iter_new_fd(raw); else return -ENOENT; @@ -538,6 +564,8 @@ static struct bpf_prog *__get_prog_inode(struct inode *inode, enum bpf_prog_type return ERR_PTR(-EINVAL); if (inode->i_op == &bpf_link_iops) return ERR_PTR(-EINVAL); + if (inode->i_op == &bpf_iter_iops) + return ERR_PTR(-EINVAL); if (inode->i_op != &bpf_prog_iops) return ERR_PTR(-EACCES); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 458f7000887a..e9ca5fbe8723 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2285,7 +2285,7 @@ static void bpf_link_show_fdinfo(struct seq_file *m, struct file *filp) } #endif -static const struct file_operations bpf_link_fops = { +const struct file_operations bpf_link_fops = { #ifdef CONFIG_PROC_FS .show_fdinfo = bpf_link_show_fdinfo, #endif From patchwork Mon Apr 27 20:12:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 062A7C81B40 for ; Mon, 27 Apr 2020 20:12:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFCB22070B for ; Mon, 27 Apr 2020 20:12:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="mHjH3w9m" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726834AbgD0UMv (ORCPT ); Mon, 27 Apr 2020 16:12:51 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:59770 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726808AbgD0UMs (ORCPT ); Mon, 27 Apr 2020 16:12:48 -0400 Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK1qBx010977 for ; Mon, 27 Apr 2020 13:12:47 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=uld8nY+2JNfoPGysCtL96iyhjSjBnDTWmEHtfezHuhw=; b=mHjH3w9m//5n/aEE6aW2heEGPyTpByj+P/qwq0XzGzxPTAmOhXcNQxR5eerKWNK4qf1I IKuZdQVT3vsy+YATA36b8vEuFBX97q708IWXHMVak2O3UHlf7YYiJfCNZRVRLGLIhrab 7UvNS5oSR5GQERmk98eRpUdBQo2vthbtvoQ= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30mk1gdyed-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:47 -0700 Received: from intmgw003.03.ash8.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:46 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 590D43700871; Mon, 27 Apr 2020 13:12:45 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 09/19] bpf: add PTR_TO_BTF_ID_OR_NULL support Date: Mon, 27 Apr 2020 13:12:45 -0700 Message-ID: <20200427201245.2995342-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 impostorscore=0 suspectscore=0 mlxlogscore=938 priorityscore=1501 lowpriorityscore=0 mlxscore=0 malwarescore=0 bulkscore=0 spamscore=0 adultscore=0 clxscore=1015 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270161 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add bpf_reg_type PTR_TO_BTF_ID_OR_NULL support. For tracing/iter program, the bpf program context definition, e.g., for previous bpf_map target, looks like struct bpf_iter_bpf_map { struct bpf_dump_meta *meta; struct bpf_map *map; }; The kernel guarantees that meta is not NULL, but map pointer maybe NULL. The NULL map indicates that all objects have been traversed, so bpf program can take proper action, e.g., do final aggregation and/or send final report to user space. Add btf_id_or_null_non0_off to prog->aux structure, to indicate that for tracing programs, if the context access offset is not 0, set to PTR_TO_BTF_ID_OR_NULL instead of PTR_TO_BTF_ID. This bit is set for tracing/iter program. Signed-off-by: Yonghong Song --- include/linux/bpf.h | 2 ++ kernel/bpf/btf.c | 5 ++++- kernel/bpf/verifier.c | 19 ++++++++++++++----- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 601b3299b7e4..d30cf0544ab0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -320,6 +320,7 @@ enum bpf_reg_type { PTR_TO_TP_BUFFER, /* reg points to a writable raw tp's buffer */ PTR_TO_XDP_SOCK, /* reg points to struct xdp_sock */ PTR_TO_BTF_ID, /* reg points to kernel struct */ + PTR_TO_BTF_ID_OR_NULL, /* reg points to kernel struct or NULL */ }; /* The information passed from prog-specific *_is_valid_access @@ -658,6 +659,7 @@ struct bpf_prog_aux { bool offload_requested; bool attach_btf_trace; /* true if attaching to BTF-enabled raw tp */ bool func_proto_unreliable; + bool btf_id_or_null_non0_off; enum bpf_tramp_prog_type trampoline_prog_type; struct bpf_trampoline *trampoline; struct hlist_node tramp_hlist; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index d65c6912bdaf..2c098e6b1acc 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3788,7 +3788,10 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, return true; /* this is a pointer to another type */ - info->reg_type = PTR_TO_BTF_ID; + if (off != 0 && prog->aux->btf_id_or_null_non0_off) + info->reg_type = PTR_TO_BTF_ID_OR_NULL; + else + info->reg_type = PTR_TO_BTF_ID; if (tgt_prog) { ret = btf_translate_to_vmlinux(log, btf, t, tgt_prog->type, arg); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index fd36c22685d9..21ec85e382ca 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -396,7 +396,8 @@ static bool reg_type_may_be_null(enum bpf_reg_type type) return type == PTR_TO_MAP_VALUE_OR_NULL || type == PTR_TO_SOCKET_OR_NULL || type == PTR_TO_SOCK_COMMON_OR_NULL || - type == PTR_TO_TCP_SOCK_OR_NULL; + type == PTR_TO_TCP_SOCK_OR_NULL || + type == PTR_TO_BTF_ID_OR_NULL; } static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) @@ -410,7 +411,8 @@ static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) return type == PTR_TO_SOCKET || type == PTR_TO_SOCKET_OR_NULL || type == PTR_TO_TCP_SOCK || - type == PTR_TO_TCP_SOCK_OR_NULL; + type == PTR_TO_TCP_SOCK_OR_NULL || + type == PTR_TO_BTF_ID_OR_NULL; } static bool arg_type_may_be_refcounted(enum bpf_arg_type type) @@ -462,6 +464,7 @@ static const char * const reg_type_str[] = { [PTR_TO_TP_BUFFER] = "tp_buffer", [PTR_TO_XDP_SOCK] = "xdp_sock", [PTR_TO_BTF_ID] = "ptr_", + [PTR_TO_BTF_ID_OR_NULL] = "ptr_or_null_", }; static char slot_type_char[] = { @@ -522,7 +525,7 @@ static void print_verifier_state(struct bpf_verifier_env *env, /* reg->off should be 0 for SCALAR_VALUE */ verbose(env, "%lld", reg->var_off.value + reg->off); } else { - if (t == PTR_TO_BTF_ID) + if (t == PTR_TO_BTF_ID || t == PTR_TO_BTF_ID_OR_NULL) verbose(env, "%s", kernel_type_name(reg->btf_id)); verbose(env, "(id=%d", reg->id); if (reg_type_may_be_refcounted_or_null(t)) @@ -2118,6 +2121,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: + case PTR_TO_BTF_ID_OR_NULL: return true; default: return false; @@ -2638,7 +2642,7 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, */ *reg_type = info.reg_type; - if (*reg_type == PTR_TO_BTF_ID) + if (*reg_type == PTR_TO_BTF_ID || *reg_type == PTR_TO_BTF_ID_OR_NULL) *btf_id = info.btf_id; else env->insn_aux_data[insn_idx].ctx_field_size = info.ctx_field_size; @@ -3222,7 +3226,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn * a sub-register. */ regs[value_regno].subreg_def = DEF_NOT_SUBREG; - if (reg_type == PTR_TO_BTF_ID) + if (reg_type == PTR_TO_BTF_ID || + reg_type == PTR_TO_BTF_ID_OR_NULL) regs[value_regno].btf_id = btf_id; } regs[value_regno].type = reg_type; @@ -6545,6 +6550,8 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, reg->type = PTR_TO_SOCK_COMMON; } else if (reg->type == PTR_TO_TCP_SOCK_OR_NULL) { reg->type = PTR_TO_TCP_SOCK; + } else if (reg->type == PTR_TO_BTF_ID_OR_NULL) { + reg->type = PTR_TO_BTF_ID; } if (is_null) { /* We don't need id and ref_obj_id from this point @@ -8403,6 +8410,7 @@ static bool reg_type_mismatch_ok(enum bpf_reg_type type) case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: + case PTR_TO_BTF_ID_OR_NULL: return false; default: return true; @@ -10612,6 +10620,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) return -EINVAL; prog->aux->attach_func_name = tname; prog->aux->attach_func_proto = t; + prog->aux->btf_id_or_null_non0_off = true; ret = btf_distill_func_proto(&env->log, btf, t, tname, &fmodel); return ret; From patchwork Mon Apr 27 20:12:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220468 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21A91C54EEB for ; Mon, 27 Apr 2020 20:13:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 04EE32070B for ; Mon, 27 Apr 2020 20:13:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="i3tp1E4I" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726904AbgD0UNF (ORCPT ); Mon, 27 Apr 2020 16:13:05 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:44458 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726691AbgD0UM4 (ORCPT ); Mon, 27 Apr 2020 16:12:56 -0400 Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK1sOs011056 for ; Mon, 27 Apr 2020 13:12:54 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=Z5WzyvU6Bl8evAcaKXqHCQvCbQ9llNqfDDDZxr8FaQY=; b=i3tp1E4IRNYbXRLHTRVbl6tgqsB1tnDU8SXfNovpSTDgCG4WGP2c+8d75ASr9W65g8Is MSer/uSCw9FpZuxWLgAhASINUoVFayqUI2GAM7s4p4HObBKW5trtnlpZLPBu0HQL7nIB J86O2qKCTdshotSUI5nKNjvH9ZQs+jfvLgY= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30mk1gdyeq-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:54 -0700 Received: from intmgw002.08.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::d) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:53 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 0D89E3700871; Mon, 27 Apr 2020 13:12:49 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 12/19] bpf: add bpf_seq_printf and bpf_seq_write helpers Date: Mon, 27 Apr 2020 13:12:49 -0700 Message-ID: <20200427201249.2995688-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 impostorscore=0 suspectscore=0 mlxlogscore=999 priorityscore=1501 lowpriorityscore=0 mlxscore=0 malwarescore=0 bulkscore=0 spamscore=0 adultscore=0 clxscore=1015 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270161 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Two helpers bpf_seq_printf and bpf_seq_write, are added for writing data to the seq_file buffer. bpf_seq_printf supports common format string flag/width/type fields so at least I can get identical results for netlink and ipv6_route targets. For bpf_seq_printf and bpf_seq_write, return value -EOVERFLOW specifically indicates a write failure due to overflow, which means the object will be repeated in the next bpf invocation if object collection stays the same. Note that if the object collection is changed, depending how collection traversal is done, even if the object still in the collection, it may not be visited. Signed-off-by: Yonghong Song --- include/uapi/linux/bpf.h | 31 ++++++- kernel/trace/bpf_trace.c | 159 +++++++++++++++++++++++++++++++++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 31 ++++++- 4 files changed, 221 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 576651110d16..f0ab17d8fb73 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3042,6 +3042,33 @@ union bpf_attr { * See: clock_gettime(CLOCK_BOOTTIME) * Return * Current *ktime*. + * + * int bpf_seq_printf(struct seq_file *m, const char *fmt, u32 fmt_size, const void *data, u32 data_len) + * Description + * seq_printf uses seq_file seq_printf() to print out the format string. + * The *m* represents the seq_file. The *fmt* and *fmt_size* are for + * the format string itself. The *data* and *data_len* are format string + * arguments. The *data* are a u64 array and corresponding format string + * values are stored in the array. For strings and pointers where pointees + * are accessed, only the pointer values are stored in the *data* array. + * The *data_len* is the *data* size in term of bytes. + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EINVAL** Invalid arguments, or invalid/unsupported formats. + * * **-E2BIG** Too many format specifiers. + * * **-ENOMEM** No enough memory to copy pointees or strings. + * * **-EOVERFLOW** Overflow happens, the same object will be tried again. + * + * int bpf_seq_write(struct seq_file *m, const void *data, u32 len) + * Description + * seq_write uses seq_file seq_write() to write the data. + * The *m* represents the seq_file. The *data* and *len* represent the + * data to write in bytes. + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EOVERFLOW** Overflow happens, the same object will be tried again. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3169,7 +3196,9 @@ union bpf_attr { FN(get_netns_cookie), \ FN(get_current_ancestor_cgroup_id), \ FN(sk_assign), \ - FN(ktime_get_boot_ns), + FN(ktime_get_boot_ns), \ + FN(seq_printf), \ + FN(seq_write), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index e875c95d3ced..f7c5587b5d2e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -457,6 +457,161 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void) return &bpf_trace_printk_proto; } +#define MAX_SEQ_PRINTF_VARARGS 12 +#define MAX_SEQ_PRINTF_STR_LEN 128 + +BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size, + const void *, data, u32, data_len) +{ + char bufs[MAX_SEQ_PRINTF_VARARGS][MAX_SEQ_PRINTF_STR_LEN]; + u64 params[MAX_SEQ_PRINTF_VARARGS]; + int i, copy_size, num_args; + const u64 *args = data; + int fmt_cnt = 0; + + /* + * bpf_check()->check_func_arg()->check_stack_boundary() + * guarantees that fmt points to bpf program stack, + * fmt_size bytes of it were initialized and fmt_size > 0 + */ + if (fmt[--fmt_size] != 0) + return -EINVAL; + + if (data_len & 7) + return -EINVAL; + + for (i = 0; i < fmt_size; i++) { + if (fmt[i] == '%' && (!data || !data_len)) + return -EINVAL; + } + + num_args = data_len / 8; + + /* check format string for allowed specifiers */ + for (i = 0; i < fmt_size; i++) { + if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) + return -EINVAL; + + if (fmt[i] != '%') + continue; + + if (fmt_cnt >= MAX_SEQ_PRINTF_VARARGS) + return -E2BIG; + + if (fmt_cnt >= num_args) + return -EINVAL; + + /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */ + i++; + + /* skip optional "[0+-][num]" width formating field */ + while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-') + i++; + if (fmt[i] >= '1' && fmt[i] <= '9') { + i++; + while (fmt[i] >= '0' && fmt[i] <= '9') + i++; + } + + if (fmt[i] == 's') { + /* disallow any further format extensions */ + if (fmt[i + 1] != 0 && + !isspace(fmt[i + 1]) && + !ispunct(fmt[i + 1])) + return -EINVAL; + + /* try our best to copy */ + bufs[fmt_cnt][0] = 0; + strncpy_from_unsafe(bufs[fmt_cnt], + (void *) (long) args[fmt_cnt], + MAX_SEQ_PRINTF_STR_LEN); + params[fmt_cnt] = (u64)(long)bufs[fmt_cnt]; + + fmt_cnt++; + continue; + } + + if (fmt[i] == 'p') { + if (fmt[i + 1] == 0 || + fmt[i + 1] == 'K' || + fmt[i + 1] == 'x') { + /* just kernel pointers */ + params[fmt_cnt] = args[fmt_cnt]; + fmt_cnt++; + continue; + } + + /* only support "%pI4", "%pi4", "%pI6" and "pi6". */ + if (fmt[i + 1] != 'i' && fmt[i + 1] != 'I') + return -EINVAL; + if (fmt[i + 2] != '4' && fmt[i + 2] != '6') + return -EINVAL; + + copy_size = (fmt[i + 2] == '4') ? 4 : 16; + + /* try our best to copy */ + probe_kernel_read(bufs[fmt_cnt], + (void *) (long) args[fmt_cnt], copy_size); + params[fmt_cnt] = (u64)(long)bufs[fmt_cnt]; + + i += 2; + fmt_cnt++; + continue; + } + + if (fmt[i] == 'l') { + i++; + if (fmt[i] == 'l') + i++; + } + + if (fmt[i] != 'i' && fmt[i] != 'd' && + fmt[i] != 'u' && fmt[i] != 'x') + return -EINVAL; + + params[fmt_cnt] = args[fmt_cnt]; + fmt_cnt++; + } + + /* Maximumly we can have MAX_SEQ_PRINTF_VARARGS parameter, just give + * all of them to seq_printf(). + */ + seq_printf(m, fmt, params[0], params[1], params[2], params[3], + params[4], params[5], params[6], params[7], params[8], + params[9], params[10], params[11]); + + return seq_has_overflowed(m) ? -EOVERFLOW : 0; +} + +static int bpf_seq_printf_btf_ids[5]; +static const struct bpf_func_proto bpf_seq_printf_proto = { + .func = bpf_seq_printf, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_BTF_ID, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE, + .arg4_type = ARG_PTR_TO_MEM_OR_NULL, + .arg5_type = ARG_CONST_SIZE_OR_ZERO, + .btf_id = bpf_seq_printf_btf_ids, +}; + +BPF_CALL_3(bpf_seq_write, struct seq_file *, m, const void *, data, u32, len) +{ + return seq_write(m, data, len) ? -EOVERFLOW : 0; +} + +static int bpf_seq_write_btf_ids[5]; +static const struct bpf_func_proto bpf_seq_write_proto = { + .func = bpf_seq_write, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_BTF_ID, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE, + .btf_id = bpf_seq_write_btf_ids, +}; + static __always_inline int get_map_perf_counter(struct bpf_map *map, u64 flags, u64 *value, u64 *enabled, u64 *running) @@ -1226,6 +1381,10 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_xdp_output: return &bpf_xdp_output_proto; #endif + case BPF_FUNC_seq_printf: + return &bpf_seq_printf_proto; + case BPF_FUNC_seq_write: + return &bpf_seq_write_proto; default: return raw_tp_prog_func_proto(func_id, prog); } diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py index f43d193aff3a..ded304c96a05 100755 --- a/scripts/bpf_helpers_doc.py +++ b/scripts/bpf_helpers_doc.py @@ -414,6 +414,7 @@ class PrinterHelpers(Printer): 'struct sk_reuseport_md', 'struct sockaddr', 'struct tcphdr', + 'struct seq_file', 'struct __sk_buff', 'struct sk_msg_md', @@ -450,6 +451,7 @@ class PrinterHelpers(Printer): 'struct sk_reuseport_md', 'struct sockaddr', 'struct tcphdr', + 'struct seq_file', } mapped_types = { 'u8': '__u8', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 576651110d16..f0ab17d8fb73 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3042,6 +3042,33 @@ union bpf_attr { * See: clock_gettime(CLOCK_BOOTTIME) * Return * Current *ktime*. + * + * int bpf_seq_printf(struct seq_file *m, const char *fmt, u32 fmt_size, const void *data, u32 data_len) + * Description + * seq_printf uses seq_file seq_printf() to print out the format string. + * The *m* represents the seq_file. The *fmt* and *fmt_size* are for + * the format string itself. The *data* and *data_len* are format string + * arguments. The *data* are a u64 array and corresponding format string + * values are stored in the array. For strings and pointers where pointees + * are accessed, only the pointer values are stored in the *data* array. + * The *data_len* is the *data* size in term of bytes. + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EINVAL** Invalid arguments, or invalid/unsupported formats. + * * **-E2BIG** Too many format specifiers. + * * **-ENOMEM** No enough memory to copy pointees or strings. + * * **-EOVERFLOW** Overflow happens, the same object will be tried again. + * + * int bpf_seq_write(struct seq_file *m, const void *data, u32 len) + * Description + * seq_write uses seq_file seq_write() to write the data. + * The *m* represents the seq_file. The *data* and *len* represent the + * data to write in bytes. + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EOVERFLOW** Overflow happens, the same object will be tried again. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3169,7 +3196,9 @@ union bpf_attr { FN(get_netns_cookie), \ FN(get_current_ancestor_cgroup_id), \ FN(sk_assign), \ - FN(ktime_get_boot_ns), + FN(ktime_get_boot_ns), \ + FN(seq_printf), \ + FN(seq_write), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call From patchwork Mon Apr 27 20:12:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6500CC54FCB for ; Mon, 27 Apr 2020 20:13:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4BB7220728 for ; Mon, 27 Apr 2020 20:13:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="T+js0XXC" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726230AbgD0UN0 (ORCPT ); Mon, 27 Apr 2020 16:13:26 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:31444 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726849AbgD0UMy (ORCPT ); Mon, 27 Apr 2020 16:12:54 -0400 Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RKAoiN031128 for ; Mon, 27 Apr 2020 13:12:53 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=KoU+eBOgm84GYKFMCRf8IzLM26AAVlbV1/nM6UxF9MA=; b=T+js0XXCiRT48ajG3jDjOCVozwRq5BkoTaI/+kjtzDTS8IIgYD/MmtSltpzXRoYcBGq6 uaKrtwASr1BxhZlQsMYMN0a/Wl0mwsS5xMwzvUGSeTV39yjoF9ygBbHVq6L4SVQo7mJe uxRt53NIbHlLOd6XBkn4Rx7z8iUJnRUofL4= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com with ESMTP id 30n57q246k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:53 -0700 Received: from intmgw001.08.frc2.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:52 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 4705C3700871; Mon, 27 Apr 2020 13:12:50 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 13/19] bpf: handle spilled PTR_TO_BTF_ID properly when checking stack_boundary Date: Mon, 27 Apr 2020 13:12:50 -0700 Message-ID: <20200427201250.2995815-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 suspectscore=0 lowpriorityscore=0 mlxscore=0 spamscore=0 phishscore=0 adultscore=0 clxscore=1015 impostorscore=0 malwarescore=0 mlxlogscore=999 priorityscore=1501 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270164 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This specifically to handle the case like below: // ptr below is a socket ptr identified by PTR_TO_BTF_ID u64 param[2] = { ptr, val }; bpf_seq_printf(seq, fmt, sizeof(fmt), param, sizeof(param)); In this case, the 16 bytes stack for "param" contains: 8 bytes for ptr with spilled PTR_TO_BTF_ID 8 bytes for val as STACK_MISC The current verifier will complain the ptr should not be visible to the helper. ... 16: (7b) *(u64 *)(r10 -64) = r2 18: (7b) *(u64 *)(r10 -56) = r1 19: (bf) r4 = r10 ; 20: (07) r4 += -64 ; BPF_SEQ_PRINTF(seq, fmt1, (long)s, s->sk_protocol); 21: (bf) r1 = r6 22: (18) r2 = 0xffffa8d00018605a 24: (b4) w3 = 10 25: (b4) w5 = 16 26: (85) call bpf_seq_printf#125 R0=inv(id=0) R1_w=ptr_seq_file(id=0,off=0,imm=0) R2_w=map_value(id=0,off=90,ks=4,vs=144,imm=0) R3_w=inv10 R4_w=fp-64 R5_w=inv16 R6=ptr_seq_file(id=0,off=0,imm=0) R7=ptr_netlink_sock(id=0,off=0,imm=0) R10=fp0 fp-56_w=mmmmmmmm fp-64_w=ptr_ last_idx 26 first_idx 13 regs=8 stack=0 before 25: (b4) w5 = 16 regs=8 stack=0 before 24: (b4) w3 = 10 invalid indirect read from stack off -64+0 size 16 Let us permit this if the program is a tracing/iter program. Signed-off-by: Yonghong Song --- kernel/bpf/verifier.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 21ec85e382ca..17a780e59f77 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3474,6 +3474,14 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, *stype = STACK_MISC; goto mark; } + + /* pointer value can be visible to tracing/iter program */ + if (env->prog->type == BPF_PROG_TYPE_TRACING && + env->prog->expected_attach_type == BPF_TRACE_ITER && + state->stack[spi].slot_type[0] == STACK_SPILL && + state->stack[spi].spilled_ptr.type == PTR_TO_BTF_ID) + goto mark; + if (state->stack[spi].slot_type[0] == STACK_SPILL && state->stack[spi].spilled_ptr.type == SCALAR_VALUE) { __mark_reg_unknown(env, &state->stack[spi].spilled_ptr); From patchwork Mon Apr 27 20:12:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8060AC4CECC for ; Mon, 27 Apr 2020 20:13:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 600CA20728 for ; Mon, 27 Apr 2020 20:13:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="MjI8PPEa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726893AbgD0UNB (ORCPT ); Mon, 27 Apr 2020 16:13:01 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:51394 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726881AbgD0UM5 (ORCPT ); Mon, 27 Apr 2020 16:12:57 -0400 Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK1sOw011056 for ; Mon, 27 Apr 2020 13:12:55 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=e6sUKVJE4TRn+OgxVb/za2wlkWSbdOe4ULxFFsDU+0w=; b=MjI8PPEaCRGxJHuZp4PRnmiLpKg588K0StKgpbcXUq4lsb8xdRy2Ioj1h+hnrnxG02nj NfbZsTAH5kCMeGewF5NjLtLCsy88rAeRvZQuto9QG3dwKfiSu1Sg4ps9LvOzAH64CRzC zSHwFMSz9yx26Lrr9TjsvjrbIoXQVP+1sY4= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30mk1gdyeq-14 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:12:55 -0700 Received: from intmgw002.08.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::d) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:12:53 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id B8AC73700871; Mon, 27 Apr 2020 13:12:52 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 15/19] tools/libbpf: add bpf_iter support Date: Mon, 27 Apr 2020 13:12:52 -0700 Message-ID: <20200427201252.2996037-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 impostorscore=0 suspectscore=2 mlxlogscore=999 priorityscore=1501 lowpriorityscore=0 mlxscore=0 malwarescore=0 bulkscore=0 spamscore=0 adultscore=0 clxscore=1015 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270161 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Three new libbpf APIs are added to support bpf_iter: - bpf_program__attach_iter Given a bpf program and additional parameters, which is none now, returns a bpf_link. - bpf_link__create_iter Given a bpf_link, create a bpf_iter and return a fd so user can then do read() to get seq_file output data. - bpf_iter_create syscall level API to create a bpf iterator. Two macros, BPF_SEQ_PRINTF0 and BPF_SEQ_PRINTF, are also introduced. These two macros can help bpf program writers with nicer bpf_seq_printf syntax similar to the kernel one. Signed-off-by: Yonghong Song --- tools/lib/bpf/bpf.c | 11 +++++++ tools/lib/bpf/bpf.h | 2 ++ tools/lib/bpf/bpf_tracing.h | 23 ++++++++++++++ tools/lib/bpf/libbpf.c | 60 +++++++++++++++++++++++++++++++++++++ tools/lib/bpf/libbpf.h | 11 +++++++ tools/lib/bpf/libbpf.map | 7 +++++ 6 files changed, 114 insertions(+) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 5cc1b0785d18..7ffd6c0ad95f 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -619,6 +619,17 @@ int bpf_link_update(int link_fd, int new_prog_fd, return sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr)); } +int bpf_iter_create(int link_fd, unsigned int flags) +{ + union bpf_attr attr; + + memset(&attr, 0, sizeof(attr)); + attr.iter_create.link_fd = link_fd; + attr.iter_create.flags = flags; + + return sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr)); +} + int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags, __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt) { diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 46d47afdd887..db9df303090e 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -187,6 +187,8 @@ struct bpf_link_update_opts { LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd, const struct bpf_link_update_opts *opts); +LIBBPF_API int bpf_iter_create(int link_fd, unsigned int flags); + struct bpf_prog_test_run_attr { int prog_fd; int repeat; diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h index f3f3c3fb98cb..4a6dffaa7e57 100644 --- a/tools/lib/bpf/bpf_tracing.h +++ b/tools/lib/bpf/bpf_tracing.h @@ -413,4 +413,27 @@ typeof(name(0)) name(struct pt_regs *ctx) \ } \ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args) +/* + * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values + * in a structure. BPF_SEQ_PRINTF0 is a simple wrapper for + * bpf_seq_printf(). + */ +#define BPF_SEQ_PRINTF0(seq, fmt) \ + ({ \ + int ret = bpf_seq_printf(seq, fmt, sizeof(fmt), \ + (void *)0, 0); \ + ret; \ + }) + +#define BPF_SEQ_PRINTF(seq, fmt, args...) \ + ({ \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ + __u64 param[___bpf_narg(args)] = { args }; \ + _Pragma("GCC diagnostic pop") \ + int ret = bpf_seq_printf(seq, fmt, sizeof(fmt), \ + param, sizeof(param)); \ + ret; \ + }) + #endif diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 8e1dc6980fac..ffdc4d8e0cc0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6366,6 +6366,9 @@ static const struct bpf_sec_def section_defs[] = { .is_attach_btf = true, .expected_attach_type = BPF_LSM_MAC, .attach_fn = attach_lsm), + SEC_DEF("iter/", TRACING, + .expected_attach_type = BPF_TRACE_ITER, + .is_attach_btf = true), BPF_PROG_SEC("xdp", BPF_PROG_TYPE_XDP), BPF_PROG_SEC("perf_event", BPF_PROG_TYPE_PERF_EVENT), BPF_PROG_SEC("lwt_in", BPF_PROG_TYPE_LWT_IN), @@ -6629,6 +6632,7 @@ static int bpf_object__collect_struct_ops_map_reloc(struct bpf_object *obj, #define BTF_TRACE_PREFIX "btf_trace_" #define BTF_LSM_PREFIX "bpf_lsm_" +#define BTF_ITER_PREFIX "__bpf_iter__" #define BTF_MAX_NAME_SIZE 128 static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix, @@ -6659,6 +6663,9 @@ static inline int __find_vmlinux_btf_id(struct btf *btf, const char *name, else if (attach_type == BPF_LSM_MAC) err = find_btf_by_prefix_kind(btf, BTF_LSM_PREFIX, name, BTF_KIND_FUNC); + else if (attach_type == BPF_TRACE_ITER) + err = find_btf_by_prefix_kind(btf, BTF_ITER_PREFIX, name, + BTF_KIND_FUNC); else err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC); @@ -7617,6 +7624,59 @@ bpf_program__attach_cgroup(struct bpf_program *prog, int cgroup_fd) return link; } +struct bpf_link * +bpf_program__attach_iter(struct bpf_program *prog, + const struct bpf_iter_attach_opts *opts) +{ + enum bpf_attach_type attach_type; + char errmsg[STRERR_BUFSIZE]; + struct bpf_link *link; + int prog_fd, link_fd; + + if (!OPTS_VALID(opts, bpf_iter_attach_opts)) + return ERR_PTR(-EINVAL); + + prog_fd = bpf_program__fd(prog); + if (prog_fd < 0) { + pr_warn("program '%s': can't attach before loaded\n", + bpf_program__title(prog, false)); + return ERR_PTR(-EINVAL); + } + + link = calloc(1, sizeof(*link)); + if (!link) + return ERR_PTR(-ENOMEM); + link->detach = &bpf_link__detach_fd; + + attach_type = bpf_program__get_expected_attach_type(prog); + link_fd = bpf_link_create(prog_fd, 0, attach_type, NULL); + if (link_fd < 0) { + link_fd = -errno; + free(link); + pr_warn("program '%s': failed to attach to iterator: %s\n", + bpf_program__title(prog, false), + libbpf_strerror_r(link_fd, errmsg, sizeof(errmsg))); + return ERR_PTR(link_fd); + } + link->fd = link_fd; + return link; +} + +int bpf_link__create_iter(struct bpf_link *link, unsigned int flags) +{ + char errmsg[STRERR_BUFSIZE]; + int iter_fd; + + iter_fd = bpf_iter_create(bpf_link__fd(link), flags); + if (iter_fd < 0) { + iter_fd = -errno; + pr_warn("failed to create an iterator: %s\n", + libbpf_strerror_r(iter_fd, errmsg, sizeof(errmsg))); + } + + return iter_fd; +} + struct bpf_link *bpf_program__attach(struct bpf_program *prog) { const struct bpf_sec_def *sec_def; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index f1dacecb1619..abe5786fcab3 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -258,6 +258,17 @@ struct bpf_map; LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map); +struct bpf_iter_attach_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ +}; +#define bpf_iter_attach_opts__last_field sz + +LIBBPF_API struct bpf_link * +bpf_program__attach_iter(struct bpf_program *prog, + const struct bpf_iter_attach_opts *opts); +LIBBPF_API int +bpf_link__create_iter(struct bpf_link *link, unsigned int flags); + struct bpf_insn; /* diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index bb8831605b25..1cea36f9f2e2 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -254,3 +254,10 @@ LIBBPF_0.0.8 { bpf_program__set_lsm; bpf_set_link_xdp_fd_opts; } LIBBPF_0.0.7; + +LIBBPF_0.0.9 { + global: + bpf_link__create_iter; + bpf_program__attach_iter; + bpf_iter_create; +} LIBBPF_0.0.8; From patchwork Mon Apr 27 20:12:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220466 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 678B2C4CECD for ; Mon, 27 Apr 2020 20:13:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B3B82075E for ; Mon, 27 Apr 2020 20:13:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="aqwaMEFU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726920AbgD0UNO (ORCPT ); Mon, 27 Apr 2020 16:13:14 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:36320 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726426AbgD0UND (ORCPT ); Mon, 27 Apr 2020 16:13:03 -0400 Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK8FdZ022602 for ; Mon, 27 Apr 2020 13:13:02 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=ZjwHMp00doVzKyGyBQm5cBW9JKLcyAAnoLaRok4ayAY=; b=aqwaMEFU01+CG+DEI1YqBijM+pZ5XvkMrqcO2LI32QpIlV1DB20M8GbujwyFWBEdSCl3 LxqfHrjKj+hdpBN8MBff06ScswPn1Ernv6PiqzbOPUkPlaRlSd9uIJB7V96+l9vGEfib q3Zj4S6ueRl0DY+crqO6VZ9QeYjM8aPA5GY= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 30n54ea3nv-8 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:13:02 -0700 Received: from intmgw002.08.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:13:01 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id F32DB3700871; Mon, 27 Apr 2020 13:12:53 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 16/19] tools/bpftool: add bpf_iter support for bptool Date: Mon, 27 Apr 2020 13:12:53 -0700 Message-ID: <20200427201253.2996156-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 phishscore=0 priorityscore=1501 suspectscore=0 spamscore=0 impostorscore=0 mlxlogscore=999 bulkscore=0 mlxscore=0 adultscore=0 malwarescore=0 clxscore=1015 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270164 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Currently, only one command is supported bpftool iter pin It will pin the trace/iter bpf program in the object file to the where should be on a bpffs mount. For example, $ bpftool iter pin ./bpf_iter_ipv6_route.o \ /sys/fs/bpf/my_route User can then do a `cat` to print out the results: $ cat /sys/fs/bpf/my_route fe800000000000000000000000000000 40 00000000000000000000000000000000 ... 00000000000000000000000000000000 00 00000000000000000000000000000000 ... 00000000000000000000000000000001 80 00000000000000000000000000000000 ... fe800000000000008c0162fffebdfd57 80 00000000000000000000000000000000 ... ff000000000000000000000000000000 08 00000000000000000000000000000000 ... 00000000000000000000000000000000 00 00000000000000000000000000000000 ... The implementation for ipv6_route iterator is in one of subsequent patches. Signed-off-by: Yonghong Song --- .../bpftool/Documentation/bpftool-iter.rst | 71 ++++++++++++++++ tools/bpf/bpftool/bash-completion/bpftool | 13 +++ tools/bpf/bpftool/iter.c | 84 +++++++++++++++++++ tools/bpf/bpftool/main.c | 3 +- tools/bpf/bpftool/main.h | 1 + 5 files changed, 171 insertions(+), 1 deletion(-) create mode 100644 tools/bpf/bpftool/Documentation/bpftool-iter.rst create mode 100644 tools/bpf/bpftool/iter.c diff --git a/tools/bpf/bpftool/Documentation/bpftool-iter.rst b/tools/bpf/bpftool/Documentation/bpftool-iter.rst new file mode 100644 index 000000000000..1997a6bac4a0 --- /dev/null +++ b/tools/bpf/bpftool/Documentation/bpftool-iter.rst @@ -0,0 +1,71 @@ +============ +bpftool-iter +============ +------------------------------------------------------------------------------- +tool to create BPF iterators +------------------------------------------------------------------------------- + +:Manual section: 8 + +SYNOPSIS +======== + + **bpftool** [*OPTIONS*] **iter** *COMMAND* + + *COMMANDS* := { **pin** | **help** } + +STRUCT_OPS COMMANDS +=================== + +| **bpftool** **iter pin** *OBJ* *PATH* +| **bpftool** **struct_ops help** +| +| *OBJ* := /a/file/of/bpf_iter_target.o + + +DESCRIPTION +=========== + **bpftool iter pin** *OBJ* *PATH* + Create a bpf iterator from *OBJ*, and pin it to + *PATH*. The *PATH* should be located in *bpffs* mount. + + **bpftool struct_ops help** + Print short help message. + +OPTIONS +======= + -h, --help + Print short generic help message (similar to **bpftool help**). + + -V, --version + Print version number (similar to **bpftool version**). + + -d, --debug + Print all logs available, even debug-level information. This + includes logs from libbpf as well as from the verifier, when + attempting to load programs. + +EXAMPLES +======== +**# bpftool iter pin bpf_iter_netlink.o /sys/fs/bpf/my_netlink** + +:: + + Create a file-based bpf iterator from bpf_iter_netlink.o and pin it + to /sys/fs/bpf/my_netlink + + +SEE ALSO +======== + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-prog**\ (8), + **bpftool-map**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-feature**\ (8), + **bpftool-net**\ (8), + **bpftool-perf**\ (8), + **bpftool-btf**\ (8) + **bpftool-gen**\ (8) + **bpftool-struct_ops**\ (8) diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool index 45ee99b159e2..17a81695da0f 100644 --- a/tools/bpf/bpftool/bash-completion/bpftool +++ b/tools/bpf/bpftool/bash-completion/bpftool @@ -604,6 +604,19 @@ _bpftool() ;; esac ;; + iter) + case $command in + pin) + _filedir + return 0 + ;; + *) + [[ $prev == $object ]] && \ + COMPREPLY=( $( compgen -W 'help' \ + -- "$cur" ) ) + ;; + esac + ;; map) local MAP_TYPE='id pinned name' case $command in diff --git a/tools/bpf/bpftool/iter.c b/tools/bpf/bpftool/iter.c new file mode 100644 index 000000000000..db9fae6be716 --- /dev/null +++ b/tools/bpf/bpftool/iter.c @@ -0,0 +1,84 @@ +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +// Copyright (C) 2020 Facebook + +#define _GNU_SOURCE +#include +#include + +#include "main.h" + +static int do_pin(int argc, char **argv) +{ + const char *objfile, *path; + struct bpf_program *prog; + struct bpf_object *obj; + struct bpf_link *link; + int err; + + if (!REQ_ARGS(2)) + usage(); + + objfile = GET_ARG(); + path = GET_ARG(); + + obj = bpf_object__open(objfile); + if (IS_ERR_OR_NULL(obj)) { + p_err("can't open objfile %s", objfile); + return -1; + } + + err = bpf_object__load(obj); + if (err < 0) { + err = -1; + p_err("can't load objfile %s", objfile); + goto close_obj; + } + + prog = bpf_program__next(NULL, obj); + link = bpf_program__attach_iter(prog, NULL); + if (IS_ERR(link)) { + err = -1; + p_err("attach_iter failed for program %s", + bpf_program__name(prog)); + goto close_obj; + } + + err = bpf_link__pin(link, path); + if (err) { + err = -1; + p_err("pin_iter failed for program %s to path %s", + bpf_program__name(prog), path); + goto close_link; + } + + err = 0; + +close_link: + bpf_link__disconnect(link); + bpf_link__destroy(link); +close_obj: + bpf_object__close(obj); + return err; +} + +static int do_help(int argc, char **argv) +{ + fprintf(stderr, + "Usage: %s %s pin OBJ PATH\n" + " %s %s help\n" + "\n", + bin_name, argv[-2], bin_name, argv[-2]); + + return 0; +} + +static const struct cmd cmds[] = { + { "help", do_help }, + { "pin", do_pin }, + { 0 } +}; + +int do_iter(int argc, char **argv) +{ + return cmd_select(cmds, argc, argv, do_help); +} diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c index 466c269eabdd..6805b77789cb 100644 --- a/tools/bpf/bpftool/main.c +++ b/tools/bpf/bpftool/main.c @@ -58,7 +58,7 @@ static int do_help(int argc, char **argv) " %s batch file FILE\n" " %s version\n" "\n" - " OBJECT := { prog | map | cgroup | perf | net | feature | btf | gen | struct_ops }\n" + " OBJECT := { prog | map | cgroup | perf | net | feature | btf | gen | struct_ops | iter }\n" " " HELP_SPEC_OPTIONS "\n" "", bin_name, bin_name, bin_name); @@ -222,6 +222,7 @@ static const struct cmd cmds[] = { { "btf", do_btf }, { "gen", do_gen }, { "struct_ops", do_struct_ops }, + { "iter", do_iter }, { "version", do_version }, { 0 } }; diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index 86f14ce26fd7..2b5d4a616b48 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -162,6 +162,7 @@ int do_feature(int argc, char **argv); int do_btf(int argc, char **argv); int do_gen(int argc, char **argv); int do_struct_ops(int argc, char **argv); +int do_iter(int argc, char **argv); int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what); int prog_parse_fd(int *argc, char ***argv); From patchwork Mon Apr 27 20:12:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 220467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.1 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD41BC54EEB for ; Mon, 27 Apr 2020 20:13:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C412120728 for ; Mon, 27 Apr 2020 20:13:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="WdF6aOkd" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726906AbgD0UNH (ORCPT ); Mon, 27 Apr 2020 16:13:07 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:29080 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726898AbgD0UNF (ORCPT ); Mon, 27 Apr 2020 16:13:05 -0400 Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 03RK1sP0011056 for ; Mon, 27 Apr 2020 13:13:03 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=1TdZKGzFnIGS1yKpHw1RLZ8PyNlKJK1CkMAjI4pyu1E=; b=WdF6aOkdXudPyHVppg5orJNhJQrItvR7czK6ZbapraiC1vWDtXrflBGgMhhn3P99jmS6 njT7AVsmZfHfwiWikvcCUUwFeKsCTBhcL3rFGCndWDUGZAlQXskk9FitPPyoiOK4x/mj WbG1Dlxixmj21o3JfqcDm6HoWTOWaXDvFeY= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com with ESMTP id 30mk1gdyfs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 27 Apr 2020 13:13:03 -0700 Received: from intmgw001.03.ash8.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1847.3; Mon, 27 Apr 2020 13:13:02 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id 721643700871; Mon, 27 Apr 2020 13:12:56 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Yonghong Song Smtp-Origin-Hostname: devbig003.ftw2.facebook.com To: Andrii Nakryiko , , Martin KaFai Lau , CC: Alexei Starovoitov , Daniel Borkmann , Smtp-Origin-Cluster: ftw2c04 Subject: [PATCH bpf-next v1 18/19] tools/bpf: selftests: add iter progs for bpf_map/task/task_file Date: Mon, 27 Apr 2020 13:12:56 -0700 Message-ID: <20200427201256.2996274-1-yhs@fb.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200427201235.2994549-1-yhs@fb.com> References: <20200427201235.2994549-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.676 definitions=2020-04-27_15:2020-04-27,2020-04-27 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 impostorscore=0 suspectscore=0 mlxlogscore=999 priorityscore=1501 lowpriorityscore=0 mlxscore=0 malwarescore=0 bulkscore=0 spamscore=0 adultscore=0 clxscore=1015 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004270161 X-FB-Internal: deliver Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The implementation is arbitrary, just to show how the bpf programs can be written for bpf_map/task/task_file. They can be costomized for specific needs. For example, for bpf_map, the iterator prints out: $ cat /sys/fs/bpf/my_bpf_map id refcnt usercnt locked_vm 3 2 0 20 6 2 0 20 9 2 0 20 12 2 0 20 13 2 0 20 16 2 0 20 19 2 0 20 === END === For task, the iterator prints out: $ cat /sys/fs/bpf/my_task tgid gid 1 1 2 2 .... 1944 1944 1948 1948 1949 1949 1953 1953 === END === For task/file, the iterator prints out: $ cat /sys/fs/bpf/my_task_file tgid gid fd file 1 1 0 ffffffff95c97600 1 1 1 ffffffff95c97600 1 1 2 ffffffff95c97600 .... 1895 1895 255 ffffffff95c8fe00 1932 1932 0 ffffffff95c8fe00 1932 1932 1 ffffffff95c8fe00 1932 1932 2 ffffffff95c8fe00 1932 1932 3 ffffffff95c185c0 This is able to print out all open files (fd and file->f_op), so user can compare f_op against a particular kernel file operations to find what it is. For example, from /proc/kallsyms, we can find ffffffff95c185c0 r eventfd_fops so we will know tgid 1932 fd 3 is an eventfd file descriptor. Signed-off-by: Yonghong Song --- .../selftests/bpf/progs/bpf_iter_bpf_map.c | 32 +++++++++++++++++++ .../selftests/bpf/progs/bpf_iter_task.c | 29 +++++++++++++++++ .../selftests/bpf/progs/bpf_iter_task_file.c | 28 ++++++++++++++++ 3 files changed, 89 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_file.c diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c new file mode 100644 index 000000000000..d4973ba4f337 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c @@ -0,0 +1,32 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include "vmlinux.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +SEC("iter/bpf_map") +int dump_bpf_map(struct bpf_iter__bpf_map *ctx) +{ + static const char banner[] = " id refcnt usercnt locked_vm\n"; + static const char footer[] = " === END ===\n"; + static const char fmt[] = "%8u %8ld %8ld %10lu\n"; + struct seq_file *seq = ctx->meta->seq; + __u64 seq_num = ctx->meta->seq_num; + struct bpf_map *map = ctx->map; + + if (map == (void *)0) { + BPF_SEQ_PRINTF0(seq, footer); + return 0; + } + + if (seq_num == 0) + BPF_SEQ_PRINTF0(seq, banner); + + BPF_SEQ_PRINTF(seq, fmt, map->id, map->refcnt.counter, + map->usercnt.counter, + map->memory.user->locked_vm.counter); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task.c b/tools/testing/selftests/bpf/progs/bpf_iter_task.c new file mode 100644 index 000000000000..78583dda3739 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include "vmlinux.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +SEC("iter/task") +int dump_tasks(struct bpf_iter__task *ctx) +{ + static char const banner[] = " tgid gid\n"; + static char const footer[] = "=== END ===\n"; + static char const fmt[] = "%8d %8d\n"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + + if (task == (void *)0) { + BPF_SEQ_PRINTF0(seq, footer); + return 0; + } + + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF0(seq, banner); + + BPF_SEQ_PRINTF(seq, fmt, task->tgid, task->pid); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c new file mode 100644 index 000000000000..7ade0303a1a5 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include "vmlinux.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +SEC("iter/task_file") +int dump_tasks(struct bpf_iter__task_file *ctx) +{ + static char const banner[] = " tgid gid fd file\n"; + static char const fmt[] = "%8d %8d %8d %lx\n"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + __u32 fd = ctx->fd; + struct file *file = ctx->file; + + if (task == (void *)0 || file == (void *)0) + return 0; + + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF0(seq, banner); + + BPF_SEQ_PRINTF(seq, fmt, task->tgid, task->pid, fd, (long)file->f_op); + return 0; +}