From patchwork Thu Jul 8 01:18:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472260 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CEAB5C07E9C for ; Thu, 8 Jul 2021 01:18:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B77B661206 for ; Thu, 8 Jul 2021 01:18:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230192AbhGHBVZ (ORCPT ); Wed, 7 Jul 2021 21:21:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230106AbhGHBVV (ORCPT ); Wed, 7 Jul 2021 21:21:21 -0400 Received: from mail-pj1-x102d.google.com (mail-pj1-x102d.google.com [IPv6:2607:f8b0:4864:20::102d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BAA0C061574; Wed, 7 Jul 2021 18:18:39 -0700 (PDT) Received: by mail-pj1-x102d.google.com with SMTP id h1-20020a17090a3d01b0290172d33bb8bcso4875593pjc.0; Wed, 07 Jul 2021 18:18:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=bGv8lt8RNeoYZwAKXokdTr6gAVZv4YRPbRC8p2kgseY=; b=rYCUSmOf/IelQ/rcTCFbPfWkLsRHlpb4lFh94Opbdm+YlN7Ai1d/YQvQDmZJo6JzOX IkQ9nEH9a0TLmOpXQwuDgxYeqYjpG6hrJnDAv9gurpoepztepeQotla7HbpWeIdOVYYd yJ8oZThO4Ld05YI1qCXcKdqhRUsZ1tvHpp7B3UgYWVr7yBIeUO1GELX+sfUpB7jiAbXk FRJ2/SddgvRUTGpPUCEErmHiUqLVcNdR3z8VRxdewoGhRNZJO9nyFQU3L8LMjd83EeUC I38NrN70fbT6vaSR9MEEL3pkgakWqmcAgTSOatuCfcgsSAUcO6NhuX6HDtOEg8qEVawq Q+4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=bGv8lt8RNeoYZwAKXokdTr6gAVZv4YRPbRC8p2kgseY=; b=YJPtAJPFHjPNnDIV5YjmZ2/5HGgYXqPX4GGTam1wfCUTsEn0C+gWyUZw1XKVRDxSE4 rGaAe7gjaMtgB5T+q5Y236wCRRgE2U/FWhszjquoWlGKzjV3HbufRHelZS+7KOnDxE4H sjGpovxJwPEPKtd6cBzbyHaHdJEGn/2SfpV5gjakFS0lRdCjmO0SVeD/OfX/7iNQjw/V ec99GOE4pWl0DKhy9epf347Y7JLfYBiSXKTBrGKIPRlYjhwDBAwZdd6qYqZ2oyNxZ1vZ vQ5oy/IFL5gipyw3NoTiSQ7voRBQ/2p4Zj1N4Q94QKpL1j+hgRP4loICl3I3a5ayYZ4N qEIg== X-Gm-Message-State: AOAM533D64IsOwyGjBMdWrPaGHKy168VcF5d6pXgFzS848xlsM0hkH0F WY6znw75iXrSUBKNpCkAqejmScmZqlk= X-Google-Smtp-Source: ABdhPJwabfMSPTbWecpunDLS+PRr/U1kalg3dSuzDKCUxrnuiRd6yMx8VZn7ED0v7Z3mdk3TQHAxkQ== X-Received: by 2002:a17:902:b94a:b029:10d:6f56:eeac with SMTP id h10-20020a170902b94ab029010d6f56eeacmr23879819pls.54.1625707119133; Wed, 07 Jul 2021 18:18:39 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.37 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:38 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 01/11] bpf: Prepare bpf_prog_put() to be called from irq context. Date: Wed, 7 Jul 2021 18:18:23 -0700 Message-Id: <20210708011833.67028-2-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Currently bpf_prog_put() is called from the task context only. With addition of bpf timers the timer related helpers will start calling bpf_prog_put() from irq-saved region and in rare cases might drop the refcnt to zero. To address this case, first, convert bpf_prog_free_id() to be irq-save (this is similar to bpf_map_free_id), and, second, defer non irq appropriate calls into work queue. For example: bpf_audit_prog() is calling kmalloc and wake_up_interruptible, bpf_prog_kallsyms_del_all()->bpf_ksym_del()->spin_unlock_bh(). They are not safe with irqs disabled. Signed-off-by: Alexei Starovoitov --- kernel/bpf/syscall.c | 32 ++++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index e343f158e556..5d1fee634be8 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1699,6 +1699,8 @@ static int bpf_prog_alloc_id(struct bpf_prog *prog) void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) { + unsigned long flags; + /* cBPF to eBPF migrations are currently not in the idr store. * Offloaded programs are removed from the store when their device * disappears - even if someone grabs an fd to them they are unusable, @@ -1708,7 +1710,7 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) return; if (do_idr_lock) - spin_lock_bh(&prog_idr_lock); + spin_lock_irqsave(&prog_idr_lock, flags); else __acquire(&prog_idr_lock); @@ -1716,7 +1718,7 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) prog->aux->id = 0; if (do_idr_lock) - spin_unlock_bh(&prog_idr_lock); + spin_unlock_irqrestore(&prog_idr_lock, flags); else __release(&prog_idr_lock); } @@ -1752,14 +1754,32 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) } } +static void bpf_prog_put_deferred(struct work_struct *work) +{ + struct bpf_prog_aux *aux; + struct bpf_prog *prog; + + aux = container_of(work, struct bpf_prog_aux, work); + prog = aux->prog; + perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0); + bpf_audit_prog(prog, BPF_AUDIT_UNLOAD); + __bpf_prog_put_noref(prog, true); +} + static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock) { - if (atomic64_dec_and_test(&prog->aux->refcnt)) { - perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0); - bpf_audit_prog(prog, BPF_AUDIT_UNLOAD); + struct bpf_prog_aux *aux = prog->aux; + + if (atomic64_dec_and_test(&aux->refcnt)) { /* bpf_prog_free_id() must be called first */ bpf_prog_free_id(prog, do_idr_lock); - __bpf_prog_put_noref(prog, true); + + if (in_irq() || irqs_disabled()) { + INIT_WORK(&aux->work, bpf_prog_put_deferred); + schedule_work(&aux->work); + } else { + bpf_prog_put_deferred(&aux->work); + } } } From patchwork Thu Jul 8 01:18:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 471691 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B935C07E9C for ; Thu, 8 Jul 2021 01:18:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E800461CC4 for ; Thu, 8 Jul 2021 01:18:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230106AbhGHBV0 (ORCPT ); Wed, 7 Jul 2021 21:21:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230164AbhGHBVX (ORCPT ); Wed, 7 Jul 2021 21:21:23 -0400 Received: from mail-pf1-x435.google.com (mail-pf1-x435.google.com [IPv6:2607:f8b0:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C672FC061574; Wed, 7 Jul 2021 18:18:41 -0700 (PDT) Received: by mail-pf1-x435.google.com with SMTP id a127so3939848pfa.10; Wed, 07 Jul 2021 18:18:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=f/cAzH+JLmeLEhMLTlyF7CnPirqTReOvrEHg/ibfezQ=; b=eytnCZVw8xhgAWrBPYTbkJ+na0mkKRMPVtooaVK+KYaN8XfcCLJQ5vLTQmGG4JfdJC pCMuhNOBoW1+R9Ez6NZg2C0mdvPgzcpHSIzuGWGWxamJjLw20tPo5+Cuah7oY0WeCqPI DVXnJqPzOaA2N2I8qdkWD3XNogvBnyAyI9YBwZkGUcNKVvSk394siwBNdvauftlbWtUW HRD0YmoLAN4VQAwrBs/DGbRVd3OvyOD5zyZwCe/9RAYRdU/7qHyjSztxrvmbJH+d49gQ gUBVGIZRveYXsK8z4Spix4+23nuMQRVDJSQYJur45DLnzjC9gaA1+uW7SlSb6LHSMnYm 6CVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=f/cAzH+JLmeLEhMLTlyF7CnPirqTReOvrEHg/ibfezQ=; b=WgYqxjebGb2+8sEyiYIp8UpHZClNn0qfyRUZJ46eMP/b4M7xOaLcFZ8OdjZiPQ2Wvv nvdyFNGTZWcg/HpN/x0M2tkTnZdMn9qWpge+eqrfhV/GlfWWnnlmQBVyfLcRRlQOzctK 7EP2nmcCUB5eVO27QSyT6Ia4mCUE7G6LI7w/fxbq3Fw4MX9M145VgsAhJ92GE5mCbIJY AdRWN02EDPoVwUws0t3Gz9oLgSh3L7lFNVmURHXqQX/TOUxhBPrzPm1WUxeMmT3WrCmf N4hnXtKzX3Q9uDxtN5zRjWNJHbE8xX30GocN4wDBkqA+36bbYj7gm5gl5fSbIl8KdCpX /ISA== X-Gm-Message-State: AOAM5304Fp68kj2IrSIWMFNbMH6LohcfVmJCHfsVJc54wAJ4mNyLXx7W WbE7WYcn3Qks1hCp42b2ijM= X-Google-Smtp-Source: ABdhPJwUtojjhz+CP2LZBTjqQRpXUPwut0WddI5yV4Q41BY7DgrBTTi7nraEdRUkG6UDZOsE2r2FEA== X-Received: by 2002:a05:6a00:2c3:b029:317:874a:391c with SMTP id b3-20020a056a0002c3b0290317874a391cmr28098152pft.5.1625707120827; Wed, 07 Jul 2021 18:18:40 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.39 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:40 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 02/11] bpf: Factor out bpf_spin_lock into helpers. Date: Wed, 7 Jul 2021 18:18:24 -0700 Message-Id: <20210708011833.67028-3-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Move ____bpf_spin_lock/unlock into helpers to make it more clear that quadruple underscore bpf_spin_lock/unlock are irqsave/restore variants. Signed-off-by: Alexei Starovoitov --- kernel/bpf/helpers.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 62cf00383910..38be3cfc2f58 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -289,13 +289,18 @@ static inline void __bpf_spin_unlock(struct bpf_spin_lock *lock) static DEFINE_PER_CPU(unsigned long, irqsave_flags); -notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock) +static inline void __bpf_spin_lock_irqsave(struct bpf_spin_lock *lock) { unsigned long flags; local_irq_save(flags); __bpf_spin_lock(lock); __this_cpu_write(irqsave_flags, flags); +} + +notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock) +{ + __bpf_spin_lock_irqsave(lock); return 0; } @@ -306,13 +311,18 @@ const struct bpf_func_proto bpf_spin_lock_proto = { .arg1_type = ARG_PTR_TO_SPIN_LOCK, }; -notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock) +static inline void __bpf_spin_unlock_irqrestore(struct bpf_spin_lock *lock) { unsigned long flags; flags = __this_cpu_read(irqsave_flags); __bpf_spin_unlock(lock); local_irq_restore(flags); +} + +notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock) +{ + __bpf_spin_unlock_irqrestore(lock); return 0; } @@ -333,9 +343,9 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, else lock = dst + map->spin_lock_off; preempt_disable(); - ____bpf_spin_lock(lock); + __bpf_spin_lock_irqsave(lock); copy_map_value(map, dst, src); - ____bpf_spin_unlock(lock); + __bpf_spin_unlock_irqrestore(lock); preempt_enable(); } From patchwork Thu Jul 8 01:18:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A5CEC07E9E for ; Thu, 8 Jul 2021 01:18:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F422161CD0 for ; Thu, 8 Jul 2021 01:18:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230164AbhGHBV0 (ORCPT ); Wed, 7 Jul 2021 21:21:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230170AbhGHBVY (ORCPT ); Wed, 7 Jul 2021 21:21:24 -0400 Received: from mail-pg1-x533.google.com (mail-pg1-x533.google.com [IPv6:2607:f8b0:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36D0CC06175F; Wed, 7 Jul 2021 18:18:43 -0700 (PDT) Received: by mail-pg1-x533.google.com with SMTP id y4so1734657pgl.10; Wed, 07 Jul 2021 18:18:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aT3D+eJIKU7CbK9GSHdMo+RILfnJgvRhtKyUBzmV3Bk=; b=bJ4o269WE1NHH/MhiB6fau84Q0+vNdVK5PiLUYFGYZhNQelh1grdhB6otuisxllSUw 31nu6zmUb3pdH0L8b/G0pKW+YmcHFe+bZNPTZrVbEkH5wVd/+Gmnucdt8HomwA/jM/0j zroKHKFdE6D68HIjev91WsM7hLgMY/aVdZDJtZtXBTkoAvSJBboDwuh29mq/VP22XwXR 5fPtziOQ+w/CtS27VyRri40lZMIu6wTjYWyo5ICvbE/OG0no7vad/sUDc4eWzfjuhDRC rHcMO5FfgkDhM0AUiN4o/E5gCpUETh+GwDJ5bdno3EPnFvQTPwp6GQgpi8uQTpxRx6Nj I8Mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aT3D+eJIKU7CbK9GSHdMo+RILfnJgvRhtKyUBzmV3Bk=; b=f3QlaJ1WNI8I9HShFFIg063qLrCPoA9S1km68D718ntnNNcPqddyP4O9n3d9OOuCQo 8MmhJj3SJzg7WwWauAVHmBJHRdNAYYRE/hqRR9gydbGEUzkJPWKU6yS4WnsbG0z4RyLH 26JP3KzOSj1wf5ddF9Xb1zwc2Z4/pxz6L/TVasLE9lR81J6u0QaJ88ay+diCbL7qEJIc Plwjc+9UqKHcnCbQAzbN7nBclUvl8VYY3avJK07lOZswBG+tWqQqW6J9IiW885981J/h hYhgprKSPZR2nzvpabH5j557CzF5Z81IFGkg8s5bixmZWdyGTE6Sf+OitaXUpv5eRn9F a8qA== X-Gm-Message-State: AOAM532YrXhhLxXZOn+huSUIv5DrqJzsc3SUPgZYIu1sWgSLYDbGoqYw HXO76yYAIrr58YmdabV9ZpI= X-Google-Smtp-Source: ABdhPJwO0aKGIH+NtPz22alCnpyvTu3vduz5GOlSd9rXBrxejYyz6yGBrsno4TXpBUQ8iFscfWRz7w== X-Received: by 2002:a05:6a00:b4b:b029:31d:c915:181f with SMTP id p11-20020a056a000b4bb029031dc915181fmr20620966pfo.13.1625707122505; Wed, 07 Jul 2021 18:18:42 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.40 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:41 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 03/11] bpf: Introduce bpf timers. Date: Wed, 7 Jul 2021 18:18:25 -0700 Message-Id: <20210708011833.67028-4-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded in hash/array/lru maps as a regular field and helpers to operate on it: // Initialize the timer. // First 4 bits of 'flags' specify clockid. // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags); // Configure the timer to call 'callback_fn' static function. long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn); // Arm the timer to expire 'nsec' nanoseconds from the current time. long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags); // Cancel the timer and wait for callback_fn to finish if it was running. long bpf_timer_cancel(struct bpf_timer *timer); Here is how BPF program might look like: struct map_elem { int counter; struct bpf_timer timer; }; struct { __uint(type, BPF_MAP_TYPE_HASH); __uint(max_entries, 1000); __type(key, int); __type(value, struct map_elem); } hmap SEC(".maps"); static int timer_cb(void *map, int *key, struct map_elem *val); /* val points to particular map element that contains bpf_timer. */ SEC("fentry/bpf_fentry_test1") int BPF_PROG(test1, int a) { struct map_elem *val; int key = 0; val = bpf_map_lookup_elem(&hmap, &key); if (val) { bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME); bpf_timer_set_callback(&val->timer, timer_cb); bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 usec */, 0); } } This patch adds helper implementations that rely on hrtimers to call bpf functions as timers expire. The following patches add necessary safety checks. Only programs with CAP_BPF are allowed to use bpf_timer. The amount of timers used by the program is constrained by the memcg recorded at map creation time. The bpf_timer_init() helper needs explicit 'map' argument because inner maps are dynamic and not known at load time. While the bpf_timer_set_callback() is receiving hidden 'aux->prog' argument supplied by the verifier. The prog pointer is needed to do refcnting of bpf program to make sure that program doesn't get freed while the timer is armed. This approach relies on "user refcnt" scheme used in prog_array that stores bpf programs for bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is paired with bpf_timer_cancel() that will drop the prog refcnt. The ops->map_release_uref is responsible for cancelling the timers and dropping prog refcnt when user space reference to a map reaches zero. This uref approach is done to make sure that Ctrl-C of user space process will not leave timers running forever unless the user space explicitly pinned a map that contained timers in bpffs. bpf_timer_set_callback() will return -EPERM if map doesn't have user references (is not held by open file descriptor from user space and not pinned in bpffs). The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel and free the timer if given map element had it allocated. "bpftool map update" command can be used to cancel timers. The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because '__u64 :64' has 1 byte alignment of 8 byte padding. Signed-off-by: Alexei Starovoitov --- include/linux/bpf.h | 3 + include/uapi/linux/bpf.h | 69 +++++++ kernel/bpf/helpers.c | 318 +++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 109 +++++++++++ kernel/trace/bpf_trace.c | 2 +- scripts/bpf_doc.py | 2 + tools/include/uapi/linux/bpf.h | 69 +++++++ 7 files changed, 571 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f309fc1509f2..72da9d4d070c 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -168,6 +168,7 @@ struct bpf_map { u32 max_entries; u32 map_flags; int spin_lock_off; /* >=0 valid offset, <0 error */ + int timer_off; /* >=0 valid offset, <0 error */ u32 id; int numa_node; u32 btf_key_type_id; @@ -221,6 +222,7 @@ static inline void copy_map_value(struct bpf_map *map, void *dst, void *src) } void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, bool lock_src); +void bpf_timer_cancel_and_free(void *timer); int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size); struct bpf_offload_dev; @@ -314,6 +316,7 @@ enum bpf_arg_type { ARG_PTR_TO_FUNC, /* pointer to a bpf program function */ ARG_PTR_TO_STACK_OR_NULL, /* pointer to stack or NULL */ ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */ + ARG_PTR_TO_TIMER, /* pointer to bpf_timer */ __BPF_ARG_TYPE_MAX, }; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index bf9252c7381e..4993baaa65be 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4780,6 +4780,66 @@ union bpf_attr { * Execute close syscall for given FD. * Return * A syscall result. + * + * long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, u64 flags) + * Description + * Initialize the timer. + * First 4 bits of *flags* specify clockid. + * Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. + * All other bits of *flags* are reserved. + * The verifier will reject the program if *timer* is not from + * the same *map*. + * Return + * 0 on success. + * **-EBUSY** if *timer* is already initialized. + * **-EINVAL** if invalid *flags* are passed. + * + * long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn) + * Description + * Configure the timer to call *callback_fn* static function. + * Return + * 0 on success. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier. + * **-EPERM** if *timer* is in a map that doesn't have any user references. + * The user space should either hold a file descriptor to a map with timers + * or pin such map in bpffs. When map is unpinned or file descriptor is + * closed all timers in the map will be cancelled and freed. + * + * long bpf_timer_start(struct bpf_timer *timer, u64 nsecs, u64 flags) + * Description + * Set timer expiration N nanoseconds from the current time. The + * configured callback will be invoked in soft irq context on some cpu + * and will not repeat unless another bpf_timer_start() is made. + * In such case the next invocation can migrate to a different cpu. + * Since struct bpf_timer is a field inside map element the map + * owns the timer. The bpf_timer_set_callback() will increment refcnt + * of BPF program to make sure that callback_fn code stays valid. + * When user space reference to a map reaches zero all timers + * in a map are cancelled and corresponding program's refcnts are + * decremented. This is done to make sure that Ctrl-C of a user + * process doesn't leave any timers running. If map is pinned in + * bpffs the callback_fn can re-arm itself indefinitely. + * bpf_map_update/delete_elem() helpers and user space sys_bpf commands + * cancel and free the timer in the given map element. + * The map can contain timers that invoke callback_fn-s from different + * programs. The same callback_fn can serve different timers from + * different maps if key/value layout matches across maps. + * Every bpf_timer_set_callback() can have different callback_fn. + * + * Return + * 0 on success. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier + * or invalid *flags* are passed. + * + * long bpf_timer_cancel(struct bpf_timer *timer) + * Description + * Cancel the timer and wait for callback_fn to finish if it was running. + * Return + * 0 if the timer was not active. + * 1 if the timer was active. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier. + * **-EDEADLK** if callback_fn tried to call bpf_timer_cancel() on its + * own timer which would have led to a deadlock otherwise. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4951,6 +5011,10 @@ union bpf_attr { FN(sys_bpf), \ FN(btf_find_by_name_kind), \ FN(sys_close), \ + FN(timer_init), \ + FN(timer_set_callback), \ + FN(timer_start), \ + FN(timer_cancel), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -6077,6 +6141,11 @@ struct bpf_spin_lock { __u32 val; }; +struct bpf_timer { + __u64 :64; + __u64 :64; +} __attribute__((aligned(8))); + struct bpf_sysctl { __u32 write; /* Sysctl is being read (= 0) or written (= 1). * Allows 1,2,4-byte read, but no write. diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 38be3cfc2f58..8f44b8c23543 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -999,6 +999,316 @@ const struct bpf_func_proto bpf_snprintf_proto = { .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; +/* BPF map elements can contain 'struct bpf_timer'. + * Such map owns all of its BPF timers. + * 'struct bpf_timer' is allocated as part of map element allocation + * and it's zero initialized. + * That space is used to keep 'struct bpf_timer_kern'. + * bpf_timer_init() allocates 'struct bpf_hrtimer', inits hrtimer, and + * remembers 'struct bpf_map *' pointer it's part of. + * bpf_timer_set_callback() increments prog refcnt and assign bpf callback_fn. + * bpf_timer_start() arms the timer. + * If user space reference to a map goes to zero at this point + * ops->map_release_uref callback is responsible for cancelling the timers, + * freeing their memory, and decrementing prog's refcnts. + * bpf_timer_cancel() cancels the timer and decrements prog's refcnt. + * Inner maps can contain bpf timers as well. ops->map_release_uref is + * freeing the timers when inner map is replaced or deleted by user space. + */ +struct bpf_hrtimer { + struct hrtimer timer; + struct bpf_map *map; + struct bpf_prog *prog; + void __rcu *callback_fn; + void *value; +}; + +/* the actual struct hidden inside uapi struct bpf_timer */ +struct bpf_timer_kern { + struct bpf_hrtimer *timer; + /* bpf_spin_lock is used here instead of spinlock_t to make + * sure that it always fits into space resereved by struct bpf_timer + * regardless of LOCKDEP and spinlock debug flags. + */ + struct bpf_spin_lock lock; +} __attribute__((aligned(8))); + +static DEFINE_PER_CPU(struct bpf_hrtimer *, hrtimer_running); + +static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) +{ + struct bpf_hrtimer *t = container_of(hrtimer, struct bpf_hrtimer, timer); + struct bpf_map *map = t->map; + void *value = t->value; + void *callback_fn; + void *key; + u32 idx; + int ret; + + callback_fn = rcu_dereference_check(t->callback_fn, rcu_read_lock_bh_held()); + if (!callback_fn) + goto out; + + /* bpf_timer_cb() runs in hrtimer_run_softirq. It doesn't migrate and + * cannot be preempted by another bpf_timer_cb() on the same cpu. + * Remember the timer this callback is servicing to prevent + * deadlock if callback_fn() calls bpf_timer_cancel() or + * bpf_map_delete_elem() on the same timer. + */ + this_cpu_write(hrtimer_running, t); + if (map->map_type == BPF_MAP_TYPE_ARRAY) { + struct bpf_array *array = container_of(map, struct bpf_array, map); + + /* compute the key */ + idx = ((char *)value - array->value) / array->elem_size; + key = &idx; + } else { /* hash or lru */ + key = value - round_up(map->key_size, 8); + } + + ret = BPF_CAST_CALL(callback_fn)((u64)(long)map, + (u64)(long)key, + (u64)(long)value, 0, 0); + WARN_ON(ret != 0); /* Next patch moves this check into the verifier */ + + this_cpu_write(hrtimer_running, NULL); +out: + return HRTIMER_NORESTART; +} + +BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map, + u64, flags) +{ + clockid_t clockid = flags & (MAX_CLOCKS - 1); + struct bpf_hrtimer *t; + int ret = 0; + + BUILD_BUG_ON(MAX_CLOCKS != 16); + BUILD_BUG_ON(sizeof(struct bpf_timer_kern) > sizeof(struct bpf_timer)); + BUILD_BUG_ON(__alignof__(struct bpf_timer_kern) != __alignof__(struct bpf_timer)); + + if (in_nmi()) + return -EOPNOTSUPP; + + if (flags >= MAX_CLOCKS || + /* similar to timerfd except _ALARM variants are not supported */ + (clockid != CLOCK_MONOTONIC && + clockid != CLOCK_REALTIME && + clockid != CLOCK_BOOTTIME)) + return -EINVAL; + __bpf_spin_lock_irqsave(&timer->lock); + t = timer->timer; + if (t) { + ret = -EBUSY; + goto out; + } + /* allocate hrtimer via map_kmalloc to use memcg accounting */ + t = bpf_map_kmalloc_node(map, sizeof(*t), GFP_ATOMIC, NUMA_NO_NODE); + if (!t) { + ret = -ENOMEM; + goto out; + } + t->value = (void *)timer - map->timer_off; + t->map = map; + t->prog = NULL; + rcu_assign_pointer(t->callback_fn, NULL); + hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT); + t->timer.function = bpf_timer_cb; + timer->timer = t; +out: + __bpf_spin_unlock_irqrestore(&timer->lock); + return ret; +} + +static const struct bpf_func_proto bpf_timer_init_proto = { + .func = bpf_timer_init, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_TIMER, + .arg2_type = ARG_CONST_MAP_PTR, + .arg3_type = ARG_ANYTHING, +}; + +BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn, + struct bpf_prog_aux *, aux) +{ + struct bpf_prog *prev, *prog = aux->prog; + struct bpf_hrtimer *t; + int ret = 0; + + if (in_nmi()) + return -EOPNOTSUPP; + __bpf_spin_lock_irqsave(&timer->lock); + t = timer->timer; + if (!t) { + ret = -EINVAL; + goto out; + } + if (!atomic64_read(&(t->map->usercnt))) { + /* maps with timers must be either held by user space + * or pinned in bpffs. Otherwise timer might still be + * running even when bpf prog is detached and user space + * is gone, since map_release_uref won't ever be called. + */ + ret = -EPERM; + goto out; + } + prev = t->prog; + if (prev != prog) { + /* Bump prog refcnt once. Every bpf_timer_set_callback() + * can pick different callback_fn-s within the same prog. + */ + prog = bpf_prog_inc_not_zero(prog); + if (IS_ERR(prog)) { + ret = PTR_ERR(prog); + goto out; + } + if (prev) + /* Drop prev prog refcnt when swapping with new prog */ + bpf_prog_put(prev); + t->prog = prog; + } + rcu_assign_pointer(t->callback_fn, callback_fn); +out: + __bpf_spin_unlock_irqrestore(&timer->lock); + return ret; +} + +static const struct bpf_func_proto bpf_timer_set_callback_proto = { + .func = bpf_timer_set_callback, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_TIMER, + .arg2_type = ARG_PTR_TO_FUNC, +}; + +BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, flags) +{ + struct bpf_hrtimer *t; + int ret = 0; + + if (in_nmi()) + return -EOPNOTSUPP; + if (flags) + return -EINVAL; + __bpf_spin_lock_irqsave(&timer->lock); + t = timer->timer; + if (!t || !t->prog) { + ret = -EINVAL; + goto out; + } + hrtimer_start(&t->timer, ns_to_ktime(nsecs), HRTIMER_MODE_REL_SOFT); +out: + __bpf_spin_unlock_irqrestore(&timer->lock); + return ret; +} + +static const struct bpf_func_proto bpf_timer_start_proto = { + .func = bpf_timer_start, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_TIMER, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_ANYTHING, +}; + +static void drop_prog_refcnt(struct bpf_hrtimer *t) +{ + struct bpf_prog *prog = t->prog; + + if (prog) { + bpf_prog_put(prog); + t->prog = NULL; + rcu_assign_pointer(t->callback_fn, NULL); + } +} + +BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer) +{ + struct bpf_hrtimer *t; + int ret = 0; + + if (in_nmi()) + return -EOPNOTSUPP; + __bpf_spin_lock_irqsave(&timer->lock); + t = timer->timer; + if (!t) { + ret = -EINVAL; + goto out; + } + if (this_cpu_read(hrtimer_running) == t) { + /* If bpf callback_fn is trying to bpf_timer_cancel() + * its own timer the hrtimer_cancel() will deadlock + * since it waits for callback_fn to finish + */ + ret = -EDEADLK; + goto out; + } + drop_prog_refcnt(t); +out: + __bpf_spin_unlock_irqrestore(&timer->lock); + /* Cancel the timer and wait for associated callback to finish + * if it was running. + */ + ret = ret ?: hrtimer_cancel(&t->timer); + return ret; +} + +static const struct bpf_func_proto bpf_timer_cancel_proto = { + .func = bpf_timer_cancel, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_TIMER, +}; + +/* This function is called by map_delete/update_elem for individual element. + * By ops->map_release_uref when the user space reference to a map reaches zero + * and by ops->map_free when the kernel reference reaches zero. + */ +void bpf_timer_cancel_and_free(void *val) +{ + struct bpf_timer_kern *timer = val; + struct bpf_hrtimer *t; + + /* Performance optimization: read timer->timer without lock first. */ + if (!READ_ONCE(timer->timer)) + return; + + __bpf_spin_lock_irqsave(&timer->lock); + /* re-read it under lock */ + t = timer->timer; + if (!t) + goto out; + drop_prog_refcnt(t); + /* The subsequent bpf_timer_start/cancel() helpers won't be able to use + * this timer, since it won't be initialized. + */ + timer->timer = NULL; +out: + __bpf_spin_unlock_irqrestore(&timer->lock); + if (!t) + return; + /* Cancel the timer and wait for callback to complete if it was running. + * If hrtimer_cancel() can be safely called it's safe to call kfree(t) + * right after for both preallocated and non-preallocated maps. + * The timer->timer = NULL was already done and no code path can + * see address 't' anymore. + * + * Check that bpf_map_delete/update_elem() wasn't called from timer + * callback_fn. In such case don't call hrtimer_cancel() (since it will + * deadlock) and don't call hrtimer_try_to_cancel() (since it will just + * return -1). Though callback_fn is still running on this cpu it's + * safe to do kfree(t) because bpf_timer_cb() read everything it needed + * from 't'. The bpf subprog callback_fn won't be able to access 't', + * since timer->timer = NULL was already done. The timer will be + * effectively cancelled because bpf_timer_cb() will return + * HRTIMER_NORESTART. + */ + if (this_cpu_read(hrtimer_running) != t) + hrtimer_cancel(&t->timer); + kfree(t); +} + const struct bpf_func_proto bpf_get_current_task_proto __weak; const struct bpf_func_proto bpf_probe_read_user_proto __weak; const struct bpf_func_proto bpf_probe_read_user_str_proto __weak; @@ -1065,6 +1375,14 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_per_cpu_ptr_proto; case BPF_FUNC_this_cpu_ptr: return &bpf_this_cpu_ptr_proto; + case BPF_FUNC_timer_init: + return &bpf_timer_init_proto; + case BPF_FUNC_timer_set_callback: + return &bpf_timer_set_callback_proto; + case BPF_FUNC_timer_start: + return &bpf_timer_start_proto; + case BPF_FUNC_timer_cancel: + return &bpf_timer_cancel_proto; default: break; } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index be38bb930bf1..3d78933687ea 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4656,6 +4656,38 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, return 0; } +static int process_timer_func(struct bpf_verifier_env *env, int regno, + struct bpf_call_arg_meta *meta) +{ + struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; + bool is_const = tnum_is_const(reg->var_off); + struct bpf_map *map = reg->map_ptr; + u64 val = reg->var_off.value; + + if (!is_const) { + verbose(env, + "R%d doesn't have constant offset. bpf_timer has to be at the constant offset\n", + regno); + return -EINVAL; + } + if (!map->btf) { + verbose(env, "map '%s' has to have BTF in order to use bpf_timer\n", + map->name); + return -EINVAL; + } + if (val) { + /* This restriction will be removed in the next patch */ + verbose(env, "bpf_timer field can only be first in the map value element\n"); + return -EINVAL; + } + if (meta->map_ptr) { + verbose(env, "verifier bug. Two map pointers in a timer helper\n"); + return -EFAULT; + } + meta->map_ptr = map; + return 0; +} + static bool arg_type_is_mem_ptr(enum bpf_arg_type type) { return type == ARG_PTR_TO_MEM || @@ -4788,6 +4820,7 @@ static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_PER static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } }; static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } }; static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } }; +static const struct bpf_reg_types timer_types = { .types = { PTR_TO_MAP_VALUE } }; static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_MAP_KEY] = &map_key_value_types, @@ -4819,6 +4852,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_FUNC] = &func_ptr_types, [ARG_PTR_TO_STACK_OR_NULL] = &stack_ptr_types, [ARG_PTR_TO_CONST_STR] = &const_str_ptr_types, + [ARG_PTR_TO_TIMER] = &timer_types, }; static int check_reg_type(struct bpf_verifier_env *env, u32 regno, @@ -4948,6 +4982,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, if (arg_type == ARG_CONST_MAP_PTR) { /* bpf_map_xxx(map_ptr) call: remember that map_ptr */ + if (meta->map_ptr && meta->map_ptr != reg->map_ptr) { + verbose(env, "Map pointer doesn't match bpf_timer.\n"); + return -EINVAL; + } meta->map_ptr = reg->map_ptr; } else if (arg_type == ARG_PTR_TO_MAP_KEY) { /* bpf_map_xxx(..., map_ptr, ..., key) call: @@ -5000,6 +5038,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, verbose(env, "verifier internal error\n"); return -EFAULT; } + } else if (arg_type == ARG_PTR_TO_TIMER) { + if (process_timer_func(env, regno, meta)) + return -EACCES; } else if (arg_type == ARG_PTR_TO_FUNC) { meta->subprogno = reg->subprogno; } else if (arg_type_is_mem_ptr(arg_type)) { @@ -5742,6 +5783,34 @@ static int set_map_elem_callback_state(struct bpf_verifier_env *env, return 0; } +static int set_timer_callback_state(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, + int insn_idx) +{ + struct bpf_map *map_ptr = caller->regs[BPF_REG_1].map_ptr; + + /* bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn); + * callback_fn(struct bpf_map *map, void *key, void *value); + */ + callee->regs[BPF_REG_1].type = CONST_PTR_TO_MAP; + __mark_reg_known_zero(&callee->regs[BPF_REG_1]); + callee->regs[BPF_REG_1].map_ptr = map_ptr; + + callee->regs[BPF_REG_2].type = PTR_TO_MAP_KEY; + __mark_reg_known_zero(&callee->regs[BPF_REG_2]); + callee->regs[BPF_REG_2].map_ptr = map_ptr; + + callee->regs[BPF_REG_3].type = PTR_TO_MAP_VALUE; + __mark_reg_known_zero(&callee->regs[BPF_REG_3]); + callee->regs[BPF_REG_3].map_ptr = map_ptr; + + /* unused */ + __mark_reg_not_init(env, &callee->regs[BPF_REG_4]); + __mark_reg_not_init(env, &callee->regs[BPF_REG_5]); + return 0; +} + static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) { struct bpf_verifier_state *state = env->cur_state; @@ -6069,6 +6138,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn return -EINVAL; } + if (func_id == BPF_FUNC_timer_set_callback) { + err = __check_func_call(env, insn, insn_idx_p, meta.subprogno, + set_timer_callback_state); + if (err < 0) + return -EINVAL; + } + if (func_id == BPF_FUNC_snprintf) { err = check_bpf_snprintf_call(env, regs); if (err < 0) @@ -12586,6 +12662,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env) continue; } + if (insn->imm == BPF_FUNC_timer_set_callback) { + /* The verifier will process callback_fn as many times as necessary + * with different maps and the register states prepared by + * set_timer_callback_state will be accurate. + * + * The following use case is valid: + * map1 is shared by prog1, prog2, prog3. + * prog1 calls bpf_timer_init for some map1 elements + * prog2 calls bpf_timer_set_callback for some map1 elements. + * Those that were not bpf_timer_init-ed will return -EINVAL. + * prog3 calls bpf_timer_start for some map1 elements. + * Those that were not both bpf_timer_init-ed and + * bpf_timer_set_callback-ed will return -EINVAL. + */ + struct bpf_insn ld_addrs[2] = { + BPF_LD_IMM64(BPF_REG_3, (long)prog->aux), + }; + + insn_buf[0] = ld_addrs[0]; + insn_buf[1] = ld_addrs[1]; + insn_buf[2] = *insn; + cnt = 3; + + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = prog = new_prog; + insn = new_prog->insnsi + i + delta; + goto patch_call_imm; + } + /* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup * and other inlining handlers are currently limited to 64 bit * only. diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 64bd2d84367f..6c77d25137e0 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1059,7 +1059,7 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_snprintf: return &bpf_snprintf_proto; default: - return NULL; + return bpf_base_func_proto(func_id); } } diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py index 2d94025b38e9..00ac7b79cddb 100755 --- a/scripts/bpf_doc.py +++ b/scripts/bpf_doc.py @@ -547,6 +547,7 @@ COMMANDS 'struct inode', 'struct socket', 'struct file', + 'struct bpf_timer', ] known_types = { '...', @@ -594,6 +595,7 @@ COMMANDS 'struct inode', 'struct socket', 'struct file', + 'struct bpf_timer', } mapped_types = { 'u8': '__u8', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index bf9252c7381e..4993baaa65be 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4780,6 +4780,66 @@ union bpf_attr { * Execute close syscall for given FD. * Return * A syscall result. + * + * long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, u64 flags) + * Description + * Initialize the timer. + * First 4 bits of *flags* specify clockid. + * Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. + * All other bits of *flags* are reserved. + * The verifier will reject the program if *timer* is not from + * the same *map*. + * Return + * 0 on success. + * **-EBUSY** if *timer* is already initialized. + * **-EINVAL** if invalid *flags* are passed. + * + * long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn) + * Description + * Configure the timer to call *callback_fn* static function. + * Return + * 0 on success. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier. + * **-EPERM** if *timer* is in a map that doesn't have any user references. + * The user space should either hold a file descriptor to a map with timers + * or pin such map in bpffs. When map is unpinned or file descriptor is + * closed all timers in the map will be cancelled and freed. + * + * long bpf_timer_start(struct bpf_timer *timer, u64 nsecs, u64 flags) + * Description + * Set timer expiration N nanoseconds from the current time. The + * configured callback will be invoked in soft irq context on some cpu + * and will not repeat unless another bpf_timer_start() is made. + * In such case the next invocation can migrate to a different cpu. + * Since struct bpf_timer is a field inside map element the map + * owns the timer. The bpf_timer_set_callback() will increment refcnt + * of BPF program to make sure that callback_fn code stays valid. + * When user space reference to a map reaches zero all timers + * in a map are cancelled and corresponding program's refcnts are + * decremented. This is done to make sure that Ctrl-C of a user + * process doesn't leave any timers running. If map is pinned in + * bpffs the callback_fn can re-arm itself indefinitely. + * bpf_map_update/delete_elem() helpers and user space sys_bpf commands + * cancel and free the timer in the given map element. + * The map can contain timers that invoke callback_fn-s from different + * programs. The same callback_fn can serve different timers from + * different maps if key/value layout matches across maps. + * Every bpf_timer_set_callback() can have different callback_fn. + * + * Return + * 0 on success. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier + * or invalid *flags* are passed. + * + * long bpf_timer_cancel(struct bpf_timer *timer) + * Description + * Cancel the timer and wait for callback_fn to finish if it was running. + * Return + * 0 if the timer was not active. + * 1 if the timer was active. + * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier. + * **-EDEADLK** if callback_fn tried to call bpf_timer_cancel() on its + * own timer which would have led to a deadlock otherwise. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4951,6 +5011,10 @@ union bpf_attr { FN(sys_bpf), \ FN(btf_find_by_name_kind), \ FN(sys_close), \ + FN(timer_init), \ + FN(timer_set_callback), \ + FN(timer_start), \ + FN(timer_cancel), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -6077,6 +6141,11 @@ struct bpf_spin_lock { __u32 val; }; +struct bpf_timer { + __u64 :64; + __u64 :64; +} __attribute__((aligned(8))); + struct bpf_sysctl { __u32 write; /* Sysctl is being read (= 0) or written (= 1). * Allows 1,2,4-byte read, but no write. From patchwork Thu Jul 8 01:18:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 471690 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4AA2C11F67 for ; Thu, 8 Jul 2021 01:18:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B303861CD0 for ; Thu, 8 Jul 2021 01:18:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230271AbhGHBV2 (ORCPT ); Wed, 7 Jul 2021 21:21:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230232AbhGHBV0 (ORCPT ); Wed, 7 Jul 2021 21:21:26 -0400 Received: from mail-pg1-x52d.google.com (mail-pg1-x52d.google.com [IPv6:2607:f8b0:4864:20::52d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E23DBC061574; Wed, 7 Jul 2021 18:18:44 -0700 (PDT) Received: by mail-pg1-x52d.google.com with SMTP id u14so4197255pga.11; Wed, 07 Jul 2021 18:18:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fxq1khmxvhJCPEywLH5j4ieavAiBiNDqpiXDqrDME/Y=; b=hKuOnz2ABR6UNX5NbAa19aXjR5GCUM5enSlFVvyAUFE4oyKLbinQpDfEuPsbrqqxeL t7FICQynCsMnbE/zyKX9haiEGCHSzNmIrg+fyLt5aMvtUJgEp2P3Lvr671Wm8ocCB+04 8/zjAoLcBCMrD62xjGIVicSeQ3Kpchb02mGD3EC392hlRdkrBRkodexyzPq4O7qWeUsQ SqIuTg4GPdTJk62l7BF2e2MlsVFGHNT7Opv99ZLZfexL3o1aQs/Q1V3jdfZzkNuW9cBV 64uh6MKLdb31/Ap10N1/SavNa5tcv7WvQXsL0E1MmkJDKltgEzegjnk+gkVAzROeLyDZ eN2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fxq1khmxvhJCPEywLH5j4ieavAiBiNDqpiXDqrDME/Y=; b=tif119oi7lsDVw10PnJXh3+YVmEr3G9TKAeQyb7UW6d2tPnbOCCdKl3dx+xIXr2ujW cb3h/yzSSCaBJHzcr52ujK29X+BSuXwGXe+jjO8DruJb1AKjH1KBycK+d0fvqv7i5zW+ 0xenawMa5bBk7bp9/606FJfDDmuOuuYMlPSgM/xL1NjWU6PnQ0zSjkGyGiTJhBtQ8WhU cxsqzcYNUsYp8gNBgjdG61yP2pIYo4pvSXyIEJJnlWto0CvtGF7L+1vFB3mU4JmRtmFm bJXsWA3uxFB+CsBBwYra5SmVyos8CcxPH3QLBY2X+WPyx+ytSOpmpN5OiifWrV0K0Vs/ SWkw== X-Gm-Message-State: AOAM532xrSjhgn6+qt+xM1vWCT4jwuQsEBXDb1NXogQ02N3y6PdS6fT+ m77XejHVzfa/5wdaUAkKCkc= X-Google-Smtp-Source: ABdhPJy3573Ba/I0MFcz6qXt5+VqFCF4LVVD5HJWg90oMtfH8GOmeLkATQWT8XsjrLwNmK09gj1qiQ== X-Received: by 2002:aa7:9407:0:b029:31c:c870:4b05 with SMTP id x7-20020aa794070000b029031cc8704b05mr21892218pfo.23.1625707124258; Wed, 07 Jul 2021 18:18:44 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.42 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:43 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 04/11] bpf: Add map side support for bpf timers. Date: Wed, 7 Jul 2021 18:18:26 -0700 Message-Id: <20210708011833.67028-5-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Restrict bpf timers to array, hash (both preallocated and kmalloced), and lru map types. The per-cpu maps with timers don't make sense, since 'struct bpf_timer' is a part of map value. bpf timers in per-cpu maps would mean that the number of timers depends on number of possible cpus and timers would not be accessible from all cpus. lpm map support can be added in the future. The timers in inner maps are supported. The bpf_map_update/delete_elem() helpers and sys_bpf commands cancel and free bpf_timer in a given map element. Similar to 'struct bpf_spin_lock' BTF is required and it is used to validate that map element indeed contains 'struct bpf_timer'. Make check_and_init_map_value() init both bpf_spin_lock and bpf_timer when map element data is reused in preallocated htab and lru maps. Teach copy_map_value() to support both bpf_spin_lock and bpf_timer in a single map element. There could be one of each, but not more than one. Due to 'one bpf_timer in one element' restriction do not support timers in global data, since global data is a map of single element, but from bpf program side it's seen as many global variables and restriction of single global timer would be odd. The sys_bpf map_freeze and sys_mmap syscalls are not allowed on maps with timers, since user space could have corrupted mmap element and crashed the kernel. The maps with timers cannot be readonly. Due to these restrictions search for bpf_timer in datasec BTF in case it was placed in the global data to report clear error. The previous patch allowed 'struct bpf_timer' as a first field in a map element only. Relax this restriction. Refactor lru map to s/bpf_lru_push_free/htab_lru_push_free/ to cancel and free the timer when lru map deletes an element as a part of it eviction algorithm. Make sure that bpf program cannot access 'struct bpf_timer' via direct load/store. The timer operation are done through helpers only. This is similar to 'struct bpf_spin_lock'. Signed-off-by: Alexei Starovoitov Acked-by: Yonghong Song --- include/linux/bpf.h | 44 ++++++++++++---- include/linux/btf.h | 1 + kernel/bpf/arraymap.c | 22 ++++++++ kernel/bpf/btf.c | 77 ++++++++++++++++++++++----- kernel/bpf/hashtab.c | 104 ++++++++++++++++++++++++++++++++----- kernel/bpf/local_storage.c | 4 +- kernel/bpf/map_in_map.c | 2 + kernel/bpf/syscall.c | 21 ++++++-- kernel/bpf/verifier.c | 30 +++++++++-- 9 files changed, 259 insertions(+), 46 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 72da9d4d070c..90e6c6d1404c 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -198,24 +198,46 @@ static inline bool map_value_has_spin_lock(const struct bpf_map *map) return map->spin_lock_off >= 0; } -static inline void check_and_init_map_lock(struct bpf_map *map, void *dst) +static inline bool map_value_has_timer(const struct bpf_map *map) { - if (likely(!map_value_has_spin_lock(map))) - return; - *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = - (struct bpf_spin_lock){}; + return map->timer_off >= 0; } -/* copy everything but bpf_spin_lock */ +static inline void check_and_init_map_value(struct bpf_map *map, void *dst) +{ + if (unlikely(map_value_has_spin_lock(map))) + *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = + (struct bpf_spin_lock){}; + if (unlikely(map_value_has_timer(map))) + *(struct bpf_timer *)(dst + map->timer_off) = + (struct bpf_timer){}; +} + +/* copy everything but bpf_spin_lock and bpf_timer. There could be one of each. */ static inline void copy_map_value(struct bpf_map *map, void *dst, void *src) { + u32 s_off = 0, s_sz = 0, t_off = 0, t_sz = 0; + if (unlikely(map_value_has_spin_lock(map))) { - u32 off = map->spin_lock_off; + s_off = map->spin_lock_off; + s_sz = sizeof(struct bpf_spin_lock); + } else if (unlikely(map_value_has_timer(map))) { + t_off = map->timer_off; + t_sz = sizeof(struct bpf_timer); + } - memcpy(dst, src, off); - memcpy(dst + off + sizeof(struct bpf_spin_lock), - src + off + sizeof(struct bpf_spin_lock), - map->value_size - off - sizeof(struct bpf_spin_lock)); + if (unlikely(s_sz || t_sz)) { + if (s_off < t_off || !s_sz) { + swap(s_off, t_off); + swap(s_sz, t_sz); + } + memcpy(dst, src, t_off); + memcpy(dst + t_off + t_sz, + src + t_off + t_sz, + s_off - t_off - t_sz); + memcpy(dst + s_off + s_sz, + src + s_off + s_sz, + map->value_size - s_off - s_sz); } else { memcpy(dst, src, map->value_size); } diff --git a/include/linux/btf.h b/include/linux/btf.h index 94a0c976c90f..214fde93214b 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -99,6 +99,7 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t); +int btf_find_timer(const struct btf *btf, const struct btf_type *t); bool btf_type_is_void(const struct btf_type *t); s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind); const struct btf_type *btf_type_skip_modifiers(const struct btf *btf, diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index 3c4105603f9d..7f07bb7adf63 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -287,6 +287,12 @@ static int array_map_get_next_key(struct bpf_map *map, void *key, void *next_key return 0; } +static void check_and_free_timer_in_array(struct bpf_array *arr, void *val) +{ + if (unlikely(map_value_has_timer(&arr->map))) + bpf_timer_cancel_and_free(val + arr->map.timer_off); +} + /* Called from syscall or from eBPF program */ static int array_map_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags) @@ -321,6 +327,7 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value, copy_map_value_locked(map, val, value, false); else copy_map_value(map, val, value); + check_and_free_timer_in_array(array, val); } return 0; } @@ -374,6 +381,19 @@ static void *array_map_vmalloc_addr(struct bpf_array *array) return (void *)round_down((unsigned long)array, PAGE_SIZE); } +static void array_map_free_timers(struct bpf_map *map) +{ + struct bpf_array *array = container_of(map, struct bpf_array, map); + int i; + + if (likely(!map_value_has_timer(map))) + return; + + for (i = 0; i < array->map.max_entries; i++) + bpf_timer_cancel_and_free(array->value + array->elem_size * i + + map->timer_off); +} + /* Called when map->refcnt goes to zero, either from workqueue or from syscall */ static void array_map_free(struct bpf_map *map) { @@ -382,6 +402,7 @@ static void array_map_free(struct bpf_map *map) if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) bpf_array_free_percpu(array); + array_map_free_timers(map); if (array->map.map_flags & BPF_F_MMAPABLE) bpf_map_area_free(array_map_vmalloc_addr(array)); else @@ -668,6 +689,7 @@ const struct bpf_map_ops array_map_ops = { .map_alloc = array_map_alloc, .map_free = array_map_free, .map_get_next_key = array_map_get_next_key, + .map_release_uref = array_map_free_timers, .map_lookup_elem = array_map_lookup_elem, .map_update_elem = array_map_update_elem, .map_delete_elem = array_map_delete_elem, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index cb4b72997d9b..7780131f710e 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3046,43 +3046,92 @@ static void btf_struct_log(struct btf_verifier_env *env, btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); } -/* find 'struct bpf_spin_lock' in map value. - * return >= 0 offset if found - * and < 0 in case of error - */ -int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) +static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t, + const char *name, int sz, int align) { const struct btf_member *member; u32 i, off = -ENOENT; - if (!__btf_type_is_struct(t)) - return -EINVAL; - for_each_member(i, t, member) { const struct btf_type *member_type = btf_type_by_id(btf, member->type); if (!__btf_type_is_struct(member_type)) continue; - if (member_type->size != sizeof(struct bpf_spin_lock)) + if (member_type->size != sz) continue; - if (strcmp(__btf_name_by_offset(btf, member_type->name_off), - "bpf_spin_lock")) + if (strcmp(__btf_name_by_offset(btf, member_type->name_off), name)) continue; if (off != -ENOENT) - /* only one 'struct bpf_spin_lock' is allowed */ + /* only one such field is allowed */ return -E2BIG; off = btf_member_bit_offset(t, member); if (off % 8) /* valid C code cannot generate such BTF */ return -EINVAL; off /= 8; - if (off % __alignof__(struct bpf_spin_lock)) - /* valid struct bpf_spin_lock will be 4 byte aligned */ + if (off % align) + return -EINVAL; + } + return off; +} + +static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t, + const char *name, int sz, int align) +{ + const struct btf_var_secinfo *vsi; + u32 i, off = -ENOENT; + + for_each_vsi(i, t, vsi) { + const struct btf_type *var = btf_type_by_id(btf, vsi->type); + const struct btf_type *var_type = btf_type_by_id(btf, var->type); + + if (!__btf_type_is_struct(var_type)) + continue; + if (var_type->size != sz) + continue; + if (vsi->size != sz) + continue; + if (strcmp(__btf_name_by_offset(btf, var_type->name_off), name)) + continue; + if (off != -ENOENT) + /* only one such field is allowed */ + return -E2BIG; + off = vsi->offset; + if (off % align) return -EINVAL; } return off; } +static int btf_find_field(const struct btf *btf, const struct btf_type *t, + const char *name, int sz, int align) +{ + + if (__btf_type_is_struct(t)) + return btf_find_struct_field(btf, t, name, sz, align); + else if (btf_type_is_datasec(t)) + return btf_find_datasec_var(btf, t, name, sz, align); + return -EINVAL; +} + +/* find 'struct bpf_spin_lock' in map value. + * return >= 0 offset if found + * and < 0 in case of error + */ +int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) +{ + return btf_find_field(btf, t, "bpf_spin_lock", + sizeof(struct bpf_spin_lock), + __alignof__(struct bpf_spin_lock)); +} + +int btf_find_timer(const struct btf *btf, const struct btf_type *t) +{ + return btf_find_field(btf, t, "bpf_timer", + sizeof(struct bpf_timer), + __alignof__(struct bpf_timer)); +} + static void __btf_struct_show(const struct btf *btf, const struct btf_type *t, u32 type_id, void *data, u8 bits_offset, struct btf_show *show) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 72c58cc516a3..4ac7d5511cea 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -228,6 +228,32 @@ static struct htab_elem *get_htab_elem(struct bpf_htab *htab, int i) return (struct htab_elem *) (htab->elems + i * (u64)htab->elem_size); } +static bool htab_has_extra_elems(struct bpf_htab *htab) +{ + return !htab_is_percpu(htab) && !htab_is_lru(htab); +} + +static void htab_free_prealloced_timers(struct bpf_htab *htab) +{ + u32 num_entries = htab->map.max_entries; + int i; + + if (likely(!map_value_has_timer(&htab->map))) + return; + if (htab_has_extra_elems(htab)) + num_entries += num_possible_cpus(); + + for (i = 0; i < num_entries; i++) { + struct htab_elem *elem; + + elem = get_htab_elem(htab, i); + bpf_timer_cancel_and_free(elem->key + + round_up(htab->map.key_size, 8) + + htab->map.timer_off); + cond_resched(); + } +} + static void htab_free_elems(struct bpf_htab *htab) { int i; @@ -244,6 +270,7 @@ static void htab_free_elems(struct bpf_htab *htab) cond_resched(); } free_elems: + htab_free_prealloced_timers(htab); bpf_map_area_free(htab->elems); } @@ -265,8 +292,11 @@ static struct htab_elem *prealloc_lru_pop(struct bpf_htab *htab, void *key, struct htab_elem *l; if (node) { + u32 key_size = htab->map.key_size; l = container_of(node, struct htab_elem, lru_node); - memcpy(l->key, key, htab->map.key_size); + memcpy(l->key, key, key_size); + check_and_init_map_value(&htab->map, + l->key + round_up(key_size, 8)); return l; } @@ -278,7 +308,7 @@ static int prealloc_init(struct bpf_htab *htab) u32 num_entries = htab->map.max_entries; int err = -ENOMEM, i; - if (!htab_is_percpu(htab) && !htab_is_lru(htab)) + if (htab_has_extra_elems(htab)) num_entries += num_possible_cpus(); htab->elems = bpf_map_area_alloc((u64)htab->elem_size * num_entries, @@ -695,6 +725,14 @@ static int htab_lru_map_gen_lookup(struct bpf_map *map, return insn - insn_buf; } +static void check_and_free_timer(struct bpf_htab *htab, struct htab_elem *elem) +{ + if (unlikely(map_value_has_timer(&htab->map))) + bpf_timer_cancel_and_free(elem->key + + round_up(htab->map.key_size, 8) + + htab->map.timer_off); +} + /* It is called from the bpf_lru_list when the LRU needs to delete * older elements from the htab. */ @@ -719,6 +757,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) if (l == tgt_l) { hlist_nulls_del_rcu(&l->hash_node); + check_and_free_timer(htab, l); break; } @@ -790,6 +829,7 @@ static void htab_elem_free(struct bpf_htab *htab, struct htab_elem *l) { if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH) free_percpu(htab_elem_get_ptr(l, htab->map.key_size)); + check_and_free_timer(htab, l); kfree(l); } @@ -817,6 +857,7 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l) htab_put_fd_value(htab, l); if (htab_is_prealloc(htab)) { + check_and_free_timer(htab, l); __pcpu_freelist_push(&htab->freelist, &l->fnode); } else { atomic_dec(&htab->count); @@ -920,8 +961,8 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, l_new = ERR_PTR(-ENOMEM); goto dec_count; } - check_and_init_map_lock(&htab->map, - l_new->key + round_up(key_size, 8)); + check_and_init_map_value(&htab->map, + l_new->key + round_up(key_size, 8)); } memcpy(l_new->key, key, key_size); @@ -1062,6 +1103,8 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value, hlist_nulls_del_rcu(&l_old->hash_node); if (!htab_is_prealloc(htab)) free_htab_elem(htab, l_old); + else + check_and_free_timer(htab, l_old); } ret = 0; err: @@ -1069,6 +1112,12 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value, return ret; } +static void htab_lru_push_free(struct bpf_htab *htab, struct htab_elem *elem) +{ + check_and_free_timer(htab, elem); + bpf_lru_push_free(&htab->lru, &elem->lru_node); +} + static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags) { @@ -1102,7 +1151,8 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value, l_new = prealloc_lru_pop(htab, key, hash); if (!l_new) return -ENOMEM; - memcpy(l_new->key + round_up(map->key_size, 8), value, map->value_size); + copy_map_value(&htab->map, + l_new->key + round_up(map->key_size, 8), value); ret = htab_lock_bucket(htab, b, hash, &flags); if (ret) @@ -1128,9 +1178,9 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value, htab_unlock_bucket(htab, b, hash, flags); if (ret) - bpf_lru_push_free(&htab->lru, &l_new->lru_node); + htab_lru_push_free(htab, l_new); else if (l_old) - bpf_lru_push_free(&htab->lru, &l_old->lru_node); + htab_lru_push_free(htab, l_old); return ret; } @@ -1339,7 +1389,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key) htab_unlock_bucket(htab, b, hash, flags); if (l) - bpf_lru_push_free(&htab->lru, &l->lru_node); + htab_lru_push_free(htab, l); return ret; } @@ -1359,6 +1409,34 @@ static void delete_all_elements(struct bpf_htab *htab) } } +static void htab_free_malloced_timers(struct bpf_htab *htab) +{ + int i; + + rcu_read_lock(); + for (i = 0; i < htab->n_buckets; i++) { + struct hlist_nulls_head *head = select_bucket(htab, i); + struct hlist_nulls_node *n; + struct htab_elem *l; + + hlist_nulls_for_each_entry(l, n, head, hash_node) + check_and_free_timer(htab, l); + } + rcu_read_unlock(); +} + +static void htab_map_free_timers(struct bpf_map *map) +{ + struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + + if (likely(!map_value_has_timer(&htab->map))) + return; + if (!htab_is_prealloc(htab)) + htab_free_malloced_timers(htab); + else + htab_free_prealloced_timers(htab); +} + /* Called when map->refcnt goes to zero, either from workqueue or from syscall */ static void htab_map_free(struct bpf_map *map) { @@ -1456,7 +1534,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, else copy_map_value(map, value, l->key + roundup_key_size); - check_and_init_map_lock(map, value); + check_and_init_map_value(map, value); } hlist_nulls_del_rcu(&l->hash_node); @@ -1467,7 +1545,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, htab_unlock_bucket(htab, b, hash, bflags); if (is_lru_map && l) - bpf_lru_push_free(&htab->lru, &l->lru_node); + htab_lru_push_free(htab, l); return ret; } @@ -1645,7 +1723,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, true); else copy_map_value(map, dst_val, value); - check_and_init_map_lock(map, dst_val); + check_and_init_map_value(map, dst_val); } if (do_delete) { hlist_nulls_del_rcu(&l->hash_node); @@ -1672,7 +1750,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, while (node_to_free) { l = node_to_free; node_to_free = node_to_free->batch_flink; - bpf_lru_push_free(&htab->lru, &l->lru_node); + htab_lru_push_free(htab, l); } next_batch: @@ -2034,6 +2112,7 @@ const struct bpf_map_ops htab_map_ops = { .map_alloc = htab_map_alloc, .map_free = htab_map_free, .map_get_next_key = htab_map_get_next_key, + .map_release_uref = htab_map_free_timers, .map_lookup_elem = htab_map_lookup_elem, .map_lookup_and_delete_elem = htab_map_lookup_and_delete_elem, .map_update_elem = htab_map_update_elem, @@ -2055,6 +2134,7 @@ const struct bpf_map_ops htab_lru_map_ops = { .map_alloc = htab_map_alloc, .map_free = htab_map_free, .map_get_next_key = htab_map_get_next_key, + .map_release_uref = htab_map_free_timers, .map_lookup_elem = htab_lru_map_lookup_elem, .map_lookup_and_delete_elem = htab_lru_map_lookup_and_delete_elem, .map_lookup_elem_sys_only = htab_lru_map_lookup_elem_sys, diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index bd11db9774c3..95d70a08325d 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -173,7 +173,7 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *key, return -ENOMEM; memcpy(&new->data[0], value, map->value_size); - check_and_init_map_lock(map, new->data); + check_and_init_map_value(map, new->data); new = xchg(&storage->buf, new); kfree_rcu(new, rcu); @@ -509,7 +509,7 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog, map->numa_node); if (!storage->buf) goto enomem; - check_and_init_map_lock(map, storage->buf->data); + check_and_init_map_value(map, storage->buf->data); } else { storage->percpu_buf = bpf_map_alloc_percpu(map, size, 8, gfp); if (!storage->percpu_buf) diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c index 39ab0b68cade..890dfe14e731 100644 --- a/kernel/bpf/map_in_map.c +++ b/kernel/bpf/map_in_map.c @@ -50,6 +50,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd) inner_map_meta->map_flags = inner_map->map_flags; inner_map_meta->max_entries = inner_map->max_entries; inner_map_meta->spin_lock_off = inner_map->spin_lock_off; + inner_map_meta->timer_off = inner_map->timer_off; /* Misc members not needed in bpf_map_meta_equal() check. */ inner_map_meta->ops = inner_map->ops; @@ -75,6 +76,7 @@ bool bpf_map_meta_equal(const struct bpf_map *meta0, return meta0->map_type == meta1->map_type && meta0->key_size == meta1->key_size && meta0->value_size == meta1->value_size && + meta0->timer_off == meta1->timer_off && meta0->map_flags == meta1->map_flags; } diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 5d1fee634be8..9a2068e39d23 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -260,8 +260,8 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value, copy_map_value_locked(map, value, ptr, true); else copy_map_value(map, value, ptr); - /* mask lock, since value wasn't zero inited */ - check_and_init_map_lock(map, value); + /* mask lock and timer, since value wasn't zero inited */ + check_and_init_map_value(map, value); } rcu_read_unlock(); } @@ -623,7 +623,8 @@ static int bpf_map_mmap(struct file *filp, struct vm_area_struct *vma) struct bpf_map *map = filp->private_data; int err; - if (!map->ops->map_mmap || map_value_has_spin_lock(map)) + if (!map->ops->map_mmap || map_value_has_spin_lock(map) || + map_value_has_timer(map)) return -ENOTSUPP; if (!(vma->vm_flags & VM_SHARED)) @@ -793,6 +794,16 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf, } } + map->timer_off = btf_find_timer(btf, value_type); + if (map_value_has_timer(map)) { + if (map->map_flags & BPF_F_RDONLY_PROG) + return -EACCES; + if (map->map_type != BPF_MAP_TYPE_HASH && + map->map_type != BPF_MAP_TYPE_LRU_HASH && + map->map_type != BPF_MAP_TYPE_ARRAY) + return -EOPNOTSUPP; + } + if (map->ops->map_check_btf) ret = map->ops->map_check_btf(map, btf, key_type, value_type); @@ -844,6 +855,7 @@ static int map_create(union bpf_attr *attr) mutex_init(&map->freeze_mutex); map->spin_lock_off = -EINVAL; + map->timer_off = -EINVAL; if (attr->btf_key_type_id || attr->btf_value_type_id || /* Even the map's value is a kernel's struct, * the bpf_prog.o must have BTF to begin with @@ -1591,7 +1603,8 @@ static int map_freeze(const union bpf_attr *attr) if (IS_ERR(map)) return PTR_ERR(map); - if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { + if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS || + map_value_has_timer(map)) { fdput(f); return -ENOTSUPP; } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3d78933687ea..e44c36107d11 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3241,6 +3241,15 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno, return -EACCES; } } + if (map_value_has_timer(map)) { + u32 t = map->timer_off; + + if (reg->smin_value + off < t + sizeof(struct bpf_timer) && + t < reg->umax_value + off + size) { + verbose(env, "bpf_timer cannot be accessed directly by load/store\n"); + return -EACCES; + } + } return err; } @@ -4675,9 +4684,24 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno, map->name); return -EINVAL; } - if (val) { - /* This restriction will be removed in the next patch */ - verbose(env, "bpf_timer field can only be first in the map value element\n"); + if (!map_value_has_timer(map)) { + if (map->timer_off == -E2BIG) + verbose(env, + "map '%s' has more than one 'struct bpf_timer'\n", + map->name); + else if (map->timer_off == -ENOENT) + verbose(env, + "map '%s' doesn't have 'struct bpf_timer'\n", + map->name); + else + verbose(env, + "map '%s' is not a struct type or bpf_timer is mangled\n", + map->name); + return -EINVAL; + } + if (map->timer_off != val + reg->off) { + verbose(env, "off %lld doesn't point to 'struct bpf_timer' that is at %d\n", + val + reg->off, map->timer_off); return -EINVAL; } if (meta->map_ptr) { From patchwork Thu Jul 8 01:18:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472258 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A858AC07E9C for ; Thu, 8 Jul 2021 01:18:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9321B61CD0 for ; Thu, 8 Jul 2021 01:18:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230265AbhGHBVb (ORCPT ); Wed, 7 Jul 2021 21:21:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230266AbhGHBV2 (ORCPT ); Wed, 7 Jul 2021 21:21:28 -0400 Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com [IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E0DAC061574; Wed, 7 Jul 2021 18:18:46 -0700 (PDT) Received: by mail-pf1-x42e.google.com with SMTP id q10so3925188pfj.12; Wed, 07 Jul 2021 18:18:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=b5bXTMctrGiuYkOisntmpk5rXHpFme1XzOtvPb/WvUI=; b=qC+5tUc0MwhhyiWD7MX7esKBwG2JHd0brLtUd7XKFGo2ryvi21xPtBG7eGx908TEnR sVmIy6CkGFI7AfBVrZtArPbcYT16WZ9kuYsKObNZbdXd3SfUuS6aSMXDtQ10u4E4ectD jpMQGqYVKZz+U9+4jXUqJKIebHTkluMi+c38I1WuB+ZG2/kIvWeDgz5uN+bFzbRYSHvA cLS02X7cH9tff7iI10NkMKq5vfsr73m9pXUpWGT/AKZm6f7N58f8dml71kkAPxqwdMLo PWnR2vkKCO3yYqJZwLdxeP4IAaHepOk+na2XVAfpQ2MFGjVyKEdJzJMGzl23R0KQh/re BpQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=b5bXTMctrGiuYkOisntmpk5rXHpFme1XzOtvPb/WvUI=; b=m9QM6dasoxAKYi64ggrk84tLVO326nx2sh142XmgpGQNuJPEKSBSjunoGEC847sd+v aM341cJ2EgxZvqdF+6z9ZrBK5TM0otZVhYGu0oJIvRZPTBIxLnaiXFr1lBlidKUfTTRX skbsM8cLJhs+KeljFb5zQHOS7HZnwXXlGs6zz35tBdbFnA0jvzDSkRonDD9Og/gPBih5 p66Lmg4zAAxb16XRbp7aQqwN/g0wXpx0nzxssyGtHqh5VZ28ZfzlfTgdSGUVSfZ9psf/ l3+9QDAzhgWB/sLXFJMlZixfpLdkqoDovq7NcTpKF8YM+v8vV6AFDjkEbDLin0LwVBct iPbg== X-Gm-Message-State: AOAM532dV8EaRqywpa/z5fykI8615BtPJRWg4LZuedskwBKzLD387YHc IcDvjOH5vYGuE1bvuUNhV+A= X-Google-Smtp-Source: ABdhPJyhYOZlZJncvFckZtLSLLPhOmH+05WFsCJPTaSxbjaKaJG/njs09wLJN8s2SU4CXt2QzDKbdQ== X-Received: by 2002:a65:4382:: with SMTP id m2mr13053661pgp.205.1625707125878; Wed, 07 Jul 2021 18:18:45 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.44 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:45 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 05/11] bpf: Prevent pointer mismatch in bpf_timer_init. Date: Wed, 7 Jul 2021 18:18:27 -0700 Message-Id: <20210708011833.67028-6-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov bpf_timer_init() arguments are: 1. pointer to a timer (which is embedded in map element). 2. pointer to a map. Make sure that pointer to a timer actually belongs to that map. Use map_uid (which is unique id of inner map) to reject: inner_map1 = bpf_map_lookup_elem(outer_map, key1) inner_map2 = bpf_map_lookup_elem(outer_map, key2) if (inner_map1 && inner_map2) { timer = bpf_map_lookup_elem(inner_map1); if (timer) // mismatch would have been allowed bpf_timer_init(timer, inner_map2); } Signed-off-by: Alexei Starovoitov --- include/linux/bpf_verifier.h | 9 ++++++++- kernel/bpf/verifier.c | 31 ++++++++++++++++++++++++++++--- 2 files changed, 36 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index e774ecc1cd1f..5d3169b57e6e 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -53,7 +53,14 @@ struct bpf_reg_state { /* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE | * PTR_TO_MAP_VALUE_OR_NULL */ - struct bpf_map *map_ptr; + struct { + struct bpf_map *map_ptr; + /* To distinguish map lookups from outer map + * the map_uid is non-zero for registers + * pointing to inner maps. + */ + u32 map_uid; + }; /* for PTR_TO_BTF_ID */ struct { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e44c36107d11..cb393de3c818 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -255,6 +255,7 @@ struct bpf_call_arg_meta { int mem_size; u64 msize_max_value; int ref_obj_id; + int map_uid; int func_id; struct btf *btf; u32 btf_id; @@ -1135,6 +1136,10 @@ static void mark_ptr_not_null_reg(struct bpf_reg_state *reg) if (map->inner_map_meta) { reg->type = CONST_PTR_TO_MAP; reg->map_ptr = map->inner_map_meta; + /* transfer reg's id which is unique for every map_lookup_elem + * as UID of the inner map. + */ + reg->map_uid = reg->id; } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { reg->type = PTR_TO_XDP_SOCK; } else if (map->map_type == BPF_MAP_TYPE_SOCKMAP || @@ -4708,6 +4713,7 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno, verbose(env, "verifier bug. Two map pointers in a timer helper\n"); return -EFAULT; } + meta->map_uid = reg->map_uid; meta->map_ptr = map; return 0; } @@ -5006,11 +5012,29 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, if (arg_type == ARG_CONST_MAP_PTR) { /* bpf_map_xxx(map_ptr) call: remember that map_ptr */ - if (meta->map_ptr && meta->map_ptr != reg->map_ptr) { - verbose(env, "Map pointer doesn't match bpf_timer.\n"); - return -EINVAL; + if (meta->map_ptr) { + /* Use map_uid (which is unique id of inner map) to reject: + * inner_map1 = bpf_map_lookup_elem(outer_map, key1) + * inner_map2 = bpf_map_lookup_elem(outer_map, key2) + * if (inner_map1 && inner_map2) { + * timer = bpf_map_lookup_elem(inner_map1); + * if (timer) + * // mismatch would have been allowed + * bpf_timer_init(timer, inner_map2); + * } + * + * Comparing map_ptr is enough to distinguish normal and outer maps. + */ + if (meta->map_ptr != reg->map_ptr || + meta->map_uid != reg->map_uid) { + verbose(env, + "timer pointer in R1 map_uid=%d doesn't match map pointer in R2 map_uid=%d\n", + meta->map_uid, reg->map_uid); + return -EINVAL; + } } meta->map_ptr = reg->map_ptr; + meta->map_uid = reg->map_uid; } else if (arg_type == ARG_PTR_TO_MAP_KEY) { /* bpf_map_xxx(..., map_ptr, ..., key) call: * check that [key, key + map->key_size) are within @@ -6204,6 +6228,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn return -EINVAL; } regs[BPF_REG_0].map_ptr = meta.map_ptr; + regs[BPF_REG_0].map_uid = meta.map_uid; if (fn->ret_type == RET_PTR_TO_MAP_VALUE) { regs[BPF_REG_0].type = PTR_TO_MAP_VALUE; if (map_value_has_spin_lock(meta.map_ptr)) From patchwork Thu Jul 8 01:18:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 471689 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F696C07E95 for ; Thu, 8 Jul 2021 01:18:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4A49F61CD0 for ; Thu, 8 Jul 2021 01:18:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230339AbhGHBVc (ORCPT ); Wed, 7 Jul 2021 21:21:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44174 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230284AbhGHBV3 (ORCPT ); Wed, 7 Jul 2021 21:21:29 -0400 Received: from mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EEE04C061764; Wed, 7 Jul 2021 18:18:47 -0700 (PDT) Received: by mail-pg1-x535.google.com with SMTP id 62so4251079pgf.1; Wed, 07 Jul 2021 18:18:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=1CgQ/PnEyWs6JaL1Gp3My3DCnI1gq2qYMVWUFv4N6hc=; b=P+o9Ycjq0p2Yk/XWgJh/w35pOjco16RzbfAJlKaGH/AQTFr98zMfBfR+35N9Gb5JRL eTuJ5PK90vDbrpLpzwvy3fXSU9xADcabEv7qMOK5/rk7itXdRjinEJKIWbZ6ZLa71wiZ 55KMbzFFqnFsBNSxPrp+NgbWy0Vu06/c6cZT/z8cqs5ASOhLnpbt0dPhsRBrWoQQOiby 9AcqAC2VvosnEbfOz/lTjaF6UnkRKaVZLe6o13FT8vjFNgSweYSRrlHuRuno3UkvGlZe ki06W+qY+4wcvcf8PvVZimyeuDJqsZxLB652i6u6LsV8qGPHPjEL7wIDTiQfIHfD9Nr+ vVQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=1CgQ/PnEyWs6JaL1Gp3My3DCnI1gq2qYMVWUFv4N6hc=; b=FcyrS3aPF/7fU+yFBodC97oA3qs/NRzShw2q/C4vrB0qFm+21LugULJk2vRu+335Py itDmZfSv0dNeeryqJ0jGONBQ2s81M9sxWzOkzX4fbvGIhXVE1/BTMLhH3olYT8X5nwsO 66mKhU0RxL7my5iZyoQtZ9BUC9FlyNuQf4Iz5kSrfIKqxuSiiSjFw2eNDrvNtr6WGk6W XQEAVdcNA+Na9g+kcwPiTaVrks9obaUMgxZQ6UiwVGqfV3eNPD/zJ/af0JQO6G0NaCQP jZKvQtJbHTvxei/37IuWSyPvhm1t/mI5HU+ISMh3mxvJupQtOjZZCvMv4obOZxfmsJqE ehMw== X-Gm-Message-State: AOAM532qj2cmt6qkiSqSNNyGxJ90FZf0b/EB7RDHruJF3EaGOoP4ol5i f2V7Sq/RjMGoWsPCweX6eK4= X-Google-Smtp-Source: ABdhPJx79DorCpbv1xV5f2n7H5JfubHXXsrrpLyA3cGjY+9a1FvEK+m4bHU/z0xcimVualrAnxvcJw== X-Received: by 2002:a63:1c20:: with SMTP id c32mr28865680pgc.41.1625707127526; Wed, 07 Jul 2021 18:18:47 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.46 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:47 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 06/11] bpf: Remember BTF of inner maps. Date: Wed, 7 Jul 2021 18:18:28 -0700 Message-Id: <20210708011833.67028-7-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov BTF is required for 'struct bpf_timer' to be recognized inside map value. The bpf timers are supported inside inner maps. Remember 'struct btf *' in inner_map_meta to make it available to the verifier in the sequence: struct bpf_map *inner_map = bpf_map_lookup_elem(&outer_map, ...); if (inner_map) timer = bpf_map_lookup_elem(&inner_map, ...); Signed-off-by: Alexei Starovoitov Acked-by: Yonghong Song --- kernel/bpf/map_in_map.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c index 890dfe14e731..5cd8f5277279 100644 --- a/kernel/bpf/map_in_map.c +++ b/kernel/bpf/map_in_map.c @@ -3,6 +3,7 @@ */ #include #include +#include #include "map_in_map.h" @@ -51,6 +52,10 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd) inner_map_meta->max_entries = inner_map->max_entries; inner_map_meta->spin_lock_off = inner_map->spin_lock_off; inner_map_meta->timer_off = inner_map->timer_off; + if (inner_map->btf) { + btf_get(inner_map->btf); + inner_map_meta->btf = inner_map->btf; + } /* Misc members not needed in bpf_map_meta_equal() check. */ inner_map_meta->ops = inner_map->ops; @@ -66,6 +71,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd) void bpf_map_meta_free(struct bpf_map *map_meta) { + btf_put(map_meta->btf); kfree(map_meta); } From patchwork Thu Jul 8 01:18:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 471688 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBD2EC07E9B for ; Thu, 8 Jul 2021 01:18:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C5D8A61CF2 for ; Thu, 8 Jul 2021 01:18:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230328AbhGHBVf (ORCPT ); Wed, 7 Jul 2021 21:21:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230317AbhGHBVc (ORCPT ); Wed, 7 Jul 2021 21:21:32 -0400 Received: from mail-pg1-x533.google.com (mail-pg1-x533.google.com [IPv6:2607:f8b0:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBC4BC061768; Wed, 7 Jul 2021 18:18:49 -0700 (PDT) Received: by mail-pg1-x533.google.com with SMTP id s18so4216832pgg.8; Wed, 07 Jul 2021 18:18:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=PDnPozsiQEH5D0xDZSajlCyRC9yDJg8xI1V4UhI3dBY=; b=OFOTdE7Yy+IXCcC8PqOoSMeW03emeQB/2lFfTZgm4MpIW81f+HJ+/C2LTN1ZWkPfZ4 67Y0rJbTUulFVLwKkfy3eiuI4nQdeHxddAaeHvrNMtv2kB645X8qOKh5FwJWF97oH5gq PIImnFnk1ZzL6YEb+JRg2rE1yqcWnS3nlfx6Lh6lBX0mbI4IJ7UZbWuQPbNqajUY2fty JLipQRLFsmMiLs8pPmym0bV5eosBQ8MxNpElmZMWGDlABk7MmmiabWVrmCGSv4r61iHO cfKHBq8Kcntjmp6DQ9Spi3dd5aa4k1MZsZL5HtnBcKDeMpzzkLDoRczJc3Y+HZaQ/+Vi gDsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=PDnPozsiQEH5D0xDZSajlCyRC9yDJg8xI1V4UhI3dBY=; b=VckkYlMD8onFdpym0LGaRWSTV6dRiRCcer1+Ox9bmIidYfs/FwxLWA1tIMLza4E3D/ ZeNdON6DrcnJYdHoOY9NnUPALx65UxIAt3OKXtMuMAqLO6WeX5VmgVljwz0w3NfY4Pzy OB2U6gq0rL20ffKhKLQM3WOQZ8OM1x87cZoSgGvkSZUtCBTnANs3junjSEVEvgbOTUIn iHVNx+rApOiBChMvY9myPJI1XJ1BSgFxuXg0JqmKYWtJtywx4JXiN7z384nmLiziUlYS uIlunlQFrbyZU0TD1zf77oGK+g8GuiHEQ0ocX/I9GpVXjFpW3yaeK4WlXgHh3GD1F/To hiew== X-Gm-Message-State: AOAM531w6oDKM9kjphDbOsxfUqm7euNVgP+J5eSxEipGtxbeSOsIh0Nw 4sU6SvyzvA4sLBxe71LxXi8= X-Google-Smtp-Source: ABdhPJyoNua+iudouW4Ab1EYIclehUZOXqnDVEAMQhNCE0Af3yx865NAGrt0kPVsxAcyUSChilK/IA== X-Received: by 2002:a63:43c4:: with SMTP id q187mr28769160pga.172.1625707129230; Wed, 07 Jul 2021 18:18:49 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.47 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:48 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 07/11] bpf: Relax verifier recursion check. Date: Wed, 7 Jul 2021 18:18:29 -0700 Message-Id: <20210708011833.67028-8-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov In the following bpf subprogram: static int timer_cb(void *map, void *key, void *value) { bpf_timer_set_callback(.., timer_cb); } the 'timer_cb' is a pointer to a function. ld_imm64 insn is used to carry this pointer. bpf_pseudo_func() returns true for such ld_imm64 insn. Unlike bpf_for_each_map_elem() the bpf_timer_set_callback() is asynchronous. Relax control flow check to allow such "recursion" that is seen as an infinite loop by check_cfg(). The distinction between bpf_for_each_map_elem() the bpf_timer_set_callback() is done in the follow up patch. Signed-off-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index cb393de3c818..1511f92b4cf4 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9463,8 +9463,12 @@ static int visit_func_call_insn(int t, int insn_cnt, init_explored_state(env, t + 1); if (visit_callee) { init_explored_state(env, t); - ret = push_insn(t, t + insns[t].imm + 1, BRANCH, - env, false); + ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env, + /* It's ok to allow recursion from CFG point of + * view. __check_func_call() will do the actual + * check. + */ + bpf_pseudo_func(insns + t)); } return ret; } From patchwork Thu Jul 8 01:18:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472256 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E25A7C07E9C for ; Thu, 8 Jul 2021 01:18:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CBBC861CD0 for ; Thu, 8 Jul 2021 01:18:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230404AbhGHBVj (ORCPT ); Wed, 7 Jul 2021 21:21:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230343AbhGHBVd (ORCPT ); Wed, 7 Jul 2021 21:21:33 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F692C061764; Wed, 7 Jul 2021 18:18:51 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id f11so2094185plg.0; Wed, 07 Jul 2021 18:18:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=IxE5DE3y2nt9vdXUjXFCPDE3BwXcH+IzGbMLLaHKMfk=; b=cTXI4VGpxRdoO2yUEX8YOkkq4guCME7vRHeT5LnoNCXUpETR+5rcHsOO5OFMn+fMsI xYxkd8xLALlQN3eCqfQCy4hrVbE6PWYxwwsJXvRxMbZBGnMEO3UvkUceBdrqXc7a215u sZUVsn0i9tBPzR68cPcS7f+qxYd2sruIC3Fu0T4Wi5B9bdjDzI57UeAiQb5/IF9RbyCM 0nWon3QmYZhPN38zYjoGQ9q9R/DQUhm2/cT6LP4OReHe/PzbrGh5bo9hjg+8whUVwh9w ajpkd06k+jffzI1Ju9f4+U8ee/B06q8C3JGpH+BzxD5fqRjIQ0FzcdlcqsJn1q8SDyCO NuJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=IxE5DE3y2nt9vdXUjXFCPDE3BwXcH+IzGbMLLaHKMfk=; b=boJQyPNIgJk52nBD6+FQnzFUv0HHK/8fv7PPKO5jUUjJkkYfxtsah9NMd+rV2irlEV vrdYmAFt4sCh+TvnisOa3UZG7NUmCL5q45a4XZQ06+odUYxV64Snl8xQP3fnBS2r7ETh wzU2cKYjYUydcj1s4Z+j1IdYgOoTQiLw3t5yQScTIEcUNMWdTU1u2sHk59GAE4YZ1Uqa t4UCanYHCD6sn2J0V1PZfEKZ7RFjygzkFDcA5g/Lk1A56vSngw+4dkQclzX/JB+3j51w oRh/5D4IwX3UEHHcWq6B6zgQJROlV/vfjCd5T2k+VA7AbPAq7KZ+7MJfB8Ir1Xklz2ne TNrQ== X-Gm-Message-State: AOAM532TQ/V26ARAnLJhV63Z/ky2z1VD394LGVzl7GeW4ho/r2LiQp92 k4Jqev0Xh4fDqJXvdlqKAYY= X-Google-Smtp-Source: ABdhPJycwr+R0Dse2ZFUofw9Z74Zbg3FssMnjEDZeM1gyserD+1K6AQaWQ6J3WG+a9LRWx6E5PlvRg== X-Received: by 2002:a17:90b:2411:: with SMTP id nr17mr6466729pjb.153.1625707130875; Wed, 07 Jul 2021 18:18:50 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.49 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:50 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 08/11] bpf: Implement verifier support for validation of async callbacks. Date: Wed, 7 Jul 2021 18:18:30 -0700 Message-Id: <20210708011833.67028-9-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov bpf_for_each_map_elem() and bpf_timer_set_callback() helpers are relying on PTR_TO_FUNC infra in the verifier to validate addresses to subprograms and pass them into the helpers as function callbacks. In case of bpf_for_each_map_elem() the callback is invoked synchronously and the verifier treats it as a normal subprogram call by adding another bpf_func_state and new frame in __check_func_call(). bpf_timer_set_callback() doesn't invoke the callback directly. The subprogram will be called asynchronously from bpf_timer_cb(). Teach the verifier to validate such async callbacks as special kind of jump by pushing verifier state into stack and let pop_stack() process it. Special care needs to be taken during state pruning. The call insn doing bpf_timer_set_callback has to be a prune_point. Otherwise short timer callbacks might not have prune points in front of bpf_timer_set_callback() which means is_state_visited() will be called after this call insn is processed in __check_func_call(). Which means that another async_cb state will be pushed to be walked later and the verifier will eventually hit BPF_COMPLEXITY_LIMIT_JMP_SEQ limit. Since push_async_cb() looks like another push_stack() branch the infinite loop detection will trigger false positive. To recognize this case mark such states as in_async_callback_fn. To distinguish infinite loop in async callback vs the same callback called with different arguments for different map and timer add async_entry_cnt to bpf_func_state. Enforce return zero from async callbacks. Signed-off-by: Alexei Starovoitov --- include/linux/bpf_verifier.h | 9 ++- kernel/bpf/helpers.c | 8 +-- kernel/bpf/verifier.c | 123 ++++++++++++++++++++++++++++++++++- 3 files changed, 131 insertions(+), 9 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 5d3169b57e6e..242d0b1a0772 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -208,12 +208,19 @@ struct bpf_func_state { * zero == main subprog */ u32 subprogno; + /* Every bpf_timer_start will increment async_entry_cnt. + * It's used to distinguish: + * void foo(void) { for(;;); } + * void foo(void) { bpf_timer_set_callback(,foo); } + */ + u32 async_entry_cnt; + bool in_callback_fn; + bool in_async_callback_fn; /* The following fields should be last. See copy_func_state() */ int acquired_refs; struct bpf_reference_state *refs; int allocated_stack; - bool in_callback_fn; struct bpf_stack_state *stack; }; diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 8f44b8c23543..9a1e554f7cab 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1043,7 +1043,6 @@ static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) void *callback_fn; void *key; u32 idx; - int ret; callback_fn = rcu_dereference_check(t->callback_fn, rcu_read_lock_bh_held()); if (!callback_fn) @@ -1066,10 +1065,9 @@ static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) key = value - round_up(map->key_size, 8); } - ret = BPF_CAST_CALL(callback_fn)((u64)(long)map, - (u64)(long)key, - (u64)(long)value, 0, 0); - WARN_ON(ret != 0); /* Next patch moves this check into the verifier */ + BPF_CAST_CALL(callback_fn)((u64)(long)map, (u64)(long)key, + (u64)(long)value, 0, 0); + /* The verifier checked that return value is zero. */ this_cpu_write(hrtimer_running, NULL); out: diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 1511f92b4cf4..ab6ce598a652 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -735,6 +735,10 @@ static void print_verifier_state(struct bpf_verifier_env *env, if (state->refs[i].id) verbose(env, ",%d", state->refs[i].id); } + if (state->in_callback_fn) + verbose(env, " cb"); + if (state->in_async_callback_fn) + verbose(env, " async_cb"); verbose(env, "\n"); } @@ -1527,6 +1531,54 @@ static void init_func_state(struct bpf_verifier_env *env, init_reg_state(env, state); } +/* Similar to push_stack(), but for async callbacks */ +static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env, + int insn_idx, int prev_insn_idx, + int subprog) +{ + struct bpf_verifier_stack_elem *elem; + struct bpf_func_state *frame; + + elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL); + if (!elem) + goto err; + + elem->insn_idx = insn_idx; + elem->prev_insn_idx = prev_insn_idx; + elem->next = env->head; + elem->log_pos = env->log.len_used; + env->head = elem; + env->stack_size++; + if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) { + verbose(env, + "The sequence of %d jumps is too complex for async cb.\n", + env->stack_size); + goto err; + } + /* Unlike push_stack() do not copy_verifier_state(). + * The caller state doesn't matter. + * This is async callback. It starts in a fresh stack. + * Initialize it similar to do_check_common(). + */ + elem->st.branches = 1; + frame = kzalloc(sizeof(*frame), GFP_KERNEL); + if (!frame) + goto err; + init_func_state(env, frame, + BPF_MAIN_FUNC /* callsite */, + 0 /* frameno within this callchain */, + subprog /* subprog number within this prog */); + elem->st.frame[0] = frame; + return &elem->st; +err: + free_verifier_state(env->cur_state, true); + env->cur_state = NULL; + /* pop all elements and return */ + while (!pop_stack(env, NULL, NULL, false)); + return NULL; +} + + enum reg_arg_type { SRC_OP, /* register is used as source operand */ DST_OP, /* register is used as destination operand */ @@ -5704,6 +5756,30 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn } } + if (insn->code == (BPF_JMP | BPF_CALL) && + insn->imm == BPF_FUNC_timer_set_callback) { + struct bpf_verifier_state *async_cb; + + /* there is no real recursion here. timer callbacks are async */ + async_cb = push_async_cb(env, env->subprog_info[subprog].start, + *insn_idx, subprog); + if (!async_cb) + return -EFAULT; + callee = async_cb->frame[0]; + callee->async_entry_cnt = caller->async_entry_cnt + 1; + + /* Convert bpf_timer_set_callback() args into timer callback args */ + err = set_callee_state_cb(env, caller, callee, *insn_idx); + if (err) + return err; + + clear_caller_saved_regs(env, caller->regs); + mark_reg_unknown(env, caller->regs, BPF_REG_0); + caller->regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG; + /* continue with next insn after call */ + return 0; + } + callee = kzalloc(sizeof(*callee), GFP_KERNEL); if (!callee) return -ENOMEM; @@ -5856,6 +5932,7 @@ static int set_timer_callback_state(struct bpf_verifier_env *env, /* unused */ __mark_reg_not_init(env, &callee->regs[BPF_REG_4]); __mark_reg_not_init(env, &callee->regs[BPF_REG_5]); + callee->in_async_callback_fn = true; return 0; } @@ -9224,7 +9301,8 @@ static int check_return_code(struct bpf_verifier_env *env) struct tnum range = tnum_range(0, 1); enum bpf_prog_type prog_type = resolve_prog_type(env->prog); int err; - const bool is_subprog = env->cur_state->frame[0]->subprogno; + struct bpf_func_state *frame = env->cur_state->frame[0]; + const bool is_subprog = frame->subprogno; /* LSM and struct_ops func-ptr's return type could be "void" */ if (!is_subprog && @@ -9249,6 +9327,22 @@ static int check_return_code(struct bpf_verifier_env *env) } reg = cur_regs(env) + BPF_REG_0; + + if (frame->in_async_callback_fn) { + /* enforce return zero from async callbacks like timer */ + if (reg->type != SCALAR_VALUE) { + verbose(env, "In async callback the register R0 is not a known value (%s)\n", + reg_type_str[reg->type]); + return -EINVAL; + } + + if (!tnum_in(tnum_const(0), reg->var_off)) { + verbose_invalid_scalar(env, reg, &range, "async callback", "R0"); + return -EINVAL; + } + return 0; + } + if (is_subprog) { if (reg->type != SCALAR_VALUE) { verbose(env, "At subprogram exit the register R0 is not a scalar value (%s)\n", @@ -9496,6 +9590,13 @@ static int visit_insn(int t, int insn_cnt, struct bpf_verifier_env *env) return DONE_EXPLORING; case BPF_CALL: + if (insns[t].imm == BPF_FUNC_timer_set_callback) + /* Mark this call insn to trigger is_state_visited() check + * before call itself is processed by __check_func_call(). + * Otherwise new async state will be pushed for further + * exploration. + */ + init_explored_state(env, t); return visit_func_call_insn(t, insn_cnt, insns, env, insns[t].src_reg == BPF_PSEUDO_CALL); @@ -10503,9 +10604,25 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) states_cnt++; if (sl->state.insn_idx != insn_idx) goto next; + if (sl->state.branches) { - if (states_maybe_looping(&sl->state, cur) && - states_equal(env, &sl->state, cur)) { + struct bpf_func_state *frame = sl->state.frame[sl->state.curframe]; + + if (frame->in_async_callback_fn && + frame->async_entry_cnt != cur->frame[cur->curframe]->async_entry_cnt) { + /* Different async_entry_cnt means that the verifier is + * processing another entry into async callback. + * Seeing the same state is not an indication of infinite + * loop or infinite recursion. + * But finding the same state doesn't mean that it's safe + * to stop processing the current state. The previous state + * hasn't yet reached bpf_exit, since state.branches > 0. + * Checking in_async_callback_fn alone is not enough either. + * Since the verifier still needs to catch infinite loops + * inside async callbacks. + */ + } else if (states_maybe_looping(&sl->state, cur) && + states_equal(env, &sl->state, cur)) { verbose_linfo(env, insn_idx, "; "); verbose(env, "infinite loop detected at insn %d\n", insn_idx); return -EINVAL; From patchwork Thu Jul 8 01:18:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3BF3C11F66 for ; Thu, 8 Jul 2021 01:18:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A25B661CD0 for ; Thu, 8 Jul 2021 01:18:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230354AbhGHBVg (ORCPT ); Wed, 7 Jul 2021 21:21:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230367AbhGHBVe (ORCPT ); Wed, 7 Jul 2021 21:21:34 -0400 Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E18E8C061767; Wed, 7 Jul 2021 18:18:52 -0700 (PDT) Received: by mail-pf1-x42f.google.com with SMTP id b12so3953435pfv.6; Wed, 07 Jul 2021 18:18:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ca4Ou8sUVl4fj4o5nVk+7zGezdg9tbhobFr+rWBV8RA=; b=OMT2movfCcOCSZUXzp2MOTI6jUkKipUDA2B3Tx/SG+DU1PPU8eGIfLMT5AjfBL3TMp aJb1GJBah6SoJMdQUN0RsMPvMhKt7FEYNgEWAzG+4ameAZRmBSYOsEBuOMZnoyCpy0/j eYChHC4osL7EzgrrZk6mo43JZDeMVJbjr4W4rpUY0DcMCb16GHJtLaNYZuE91U0mp4UW Zho75PoNn1sLvzk5nmYOTCDqTPmzk05Qi8Zav18CMh8eBYnr9jgwjfveBN8q39whp4Fu 6a6Yxt5FErqpLVXe4gWXO76MBwNm9SGEUwBs3Cu2DAeNqZ7Uh8xBQ3JtxLC0zbS3nVQB oqGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ca4Ou8sUVl4fj4o5nVk+7zGezdg9tbhobFr+rWBV8RA=; b=p8AsWFkwZG5s3KJxvJ1uT9x0oifjucwfeoQZ8mbVfbOfLR92zUUOBoGlpla04P0FMH iAIfYINAUHwAIp3PQzH+7AtWQJEQ5K8TBgPqcz8TXaOI40gD8YBOpvHF/jv8dhQvHwKp f+uqB0x6eSbXH4mT2rXQlyD9zkB0/ubr1JBi7J9jmvl+Ukit0j6DPAm6m2DHgTZCKVNK cEI/9tVkWGyYzbo2UXZrfYPe/zYu21Eyp663m3qk2ryZouAo1i6OlZMfvOECL+cbvy6f V23nf2/zi4w+3zB69w6VZ3cxB6jK2go87WBvvBfASNnAcfM3h3G0nkIsLM8giGZ5ovP6 Mysw== X-Gm-Message-State: AOAM533Q6x6eiAQdZipjItuiWAKaNNhH5vjTVyJB9qGiTlUmR6p5wNGr pL2jqth2ykvZ7qGlbOALD0Q= X-Google-Smtp-Source: ABdhPJwAMQIlAuafsFd15vAkV4Wl5oQXyG6csSatFRTPuNdEBgeKKOWYL8KbooyWqNXhsbkJS4sV3Q== X-Received: by 2002:a63:f916:: with SMTP id h22mr29086762pgi.6.1625707132464; Wed, 07 Jul 2021 18:18:52 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.51 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:51 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 09/11] bpf: Teach stack depth check about async callbacks. Date: Wed, 7 Jul 2021 18:18:31 -0700 Message-Id: <20210708011833.67028-10-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Teach max stack depth checking algorithm about async callbacks that don't increase bpf program stack size. Also add sanity check that bpf_tail_call didn't sneak into async cb. It's impossible, since PTR_TO_CTX is not available in async cb, hence the program cannot contain bpf_tail_call(ctx,...); Signed-off-by: Alexei Starovoitov --- include/linux/bpf_verifier.h | 1 + kernel/bpf/verifier.c | 18 +++++++++++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 242d0b1a0772..b847e1ccd10f 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -406,6 +406,7 @@ struct bpf_subprog_info { bool has_tail_call; bool tail_call_reachable; bool has_ld_abs; + bool is_async_cb; }; /* single container for all structs diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ab6ce598a652..84f67580ab19 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3709,6 +3709,8 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) continue_func: subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { + int next_insn; + if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i)) continue; /* remember insn and function to return to */ @@ -3716,13 +3718,22 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) ret_prog[frame] = idx; /* find the callee */ - i = i + insn[i].imm + 1; - idx = find_subprog(env, i); + next_insn = i + insn[i].imm + 1; + idx = find_subprog(env, next_insn); if (idx < 0) { WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", - i); + next_insn); return -EFAULT; } + if (subprog[idx].is_async_cb) { + if (subprog[idx].has_tail_call) { + verbose(env, "verifier bug. subprog has tail_call and async cb\n"); + return -EFAULT; + } + /* async callbacks don't increase bpf prog stack size */ + continue; + } + i = next_insn; if (subprog[idx].has_tail_call) tail_call_reachable = true; @@ -5761,6 +5772,7 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn struct bpf_verifier_state *async_cb; /* there is no real recursion here. timer callbacks are async */ + env->subprog_info[subprog].is_async_cb = true; async_cb = push_async_cb(env, env->subprog_info[subprog].start, *insn_idx, subprog); if (!async_cb) From patchwork Thu Jul 8 01:18:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 471687 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A47CCC07E95 for ; Thu, 8 Jul 2021 01:19:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8D27F61CC4 for ; Thu, 8 Jul 2021 01:19:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230401AbhGHBVk (ORCPT ); Wed, 7 Jul 2021 21:21:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44206 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230349AbhGHBVg (ORCPT ); Wed, 7 Jul 2021 21:21:36 -0400 Received: from mail-pj1-x102a.google.com (mail-pj1-x102a.google.com [IPv6:2607:f8b0:4864:20::102a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A8CC1C061574; Wed, 7 Jul 2021 18:18:54 -0700 (PDT) Received: by mail-pj1-x102a.google.com with SMTP id x21-20020a17090aa395b029016e25313bfcso2774912pjp.2; Wed, 07 Jul 2021 18:18:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=BJa6mry8J0MwiXrptb/iORsTljLvfzWlcZ7RwWWwnMg=; b=ECUs4vwd/c+D7k1A1Z5HAToiuobz2CVK/k6jTcFr7DwPt0iklBejYXIuywcywAxu11 g7snkEXv2uPLDhZh/pnPpVMiyAFlMLFwONjbRP/HdTmsJEsC8BMukl4254pWR1xb0lcN fCXSBdIrMMRU1DujXI7OIqxrw/A9yWIdUrPklA+pv4RI1qJRkzWqcKeag0zGgORmNbhU SclBgep51PhDt7zrehuwNc0otxW9OWH9Z8ADZZhItbzvveZMFXLPb4U/J7j1XgqYZYN4 YxPhC62S3qcYPSEEg5u/J+45jmVmXvzCbpys6K0qDzIegTm18BGP+Rlzl22KOiGb/ZH0 lsgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BJa6mry8J0MwiXrptb/iORsTljLvfzWlcZ7RwWWwnMg=; b=LQl7fcF4dLNCAq1cm0gYr1R+/jORckX88ikKhYTqf84vo2+iWURIwrg68JtjYw2UYe 5c4IR/0yeC+PEgYrlYnTbjo+zpHAQrP4KKKtS3qRfYrER9u6ioVnY5U1SSQkKs0dN5tQ vlAg1X81htupXqjB/I5GbbsCNYTXqf1SR7i5UuWlLw20NQLlCVCj92N6asm1j1kRWK+w XDKEfxAaw2sxwCOxSQrYoFJ/YoWSXT7Js3rlaoHkll0pFoZUv0Az8eexkDQEUtgRX7JK uqDaiC0WvKES8RPMPj+a2MseGne0RdkyD0y72SLDFfJGWCymsDNkdh/Vv/3qx8MBQuVN hXjQ== X-Gm-Message-State: AOAM532KT4Ng3DkrcVLA3+lZ1DRSpx6bfSgNeaJMTh+mueLwWWVmhAXJ N30kLvs7L7qTCkTHrBFw48Y= X-Google-Smtp-Source: ABdhPJxcTOOIIWP7S7QSCJ/KKdat4ZZRmC2Z2NrabBJwotqSYfcHK8tsXY5BIKgAg4jhGT0Lxn4l9A== X-Received: by 2002:a17:90a:f986:: with SMTP id cq6mr29944700pjb.127.1625707134142; Wed, 07 Jul 2021 18:18:54 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.52 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:53 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 10/11] selftests/bpf: Add bpf_timer test. Date: Wed, 7 Jul 2021 18:18:32 -0700 Message-Id: <20210708011833.67028-11-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Add bpf_timer test that creates timers in preallocated and non-preallocated hash, in array and in lru maps. Let array timer expire once and then re-arm it for 35 seconds. Arm lru timer into the same callback. Then arm and re-arm hash timers 10 times each. At the last invocation of prealloc hash timer cancel the array timer. Force timer free via LRU eviction and direct bpf_map_delete_elem. Signed-off-by: Alexei Starovoitov --- .../testing/selftests/bpf/prog_tests/timer.c | 55 ++++ tools/testing/selftests/bpf/progs/timer.c | 297 ++++++++++++++++++ 2 files changed, 352 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/timer.c create mode 100644 tools/testing/selftests/bpf/progs/timer.c diff --git a/tools/testing/selftests/bpf/prog_tests/timer.c b/tools/testing/selftests/bpf/prog_tests/timer.c new file mode 100644 index 000000000000..25f40e1b9967 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/timer.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include "timer.skel.h" + +static int timer(struct timer *timer_skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + + err = timer__attach(timer_skel); + if (!ASSERT_OK(err, "timer_attach")) + return err; + + ASSERT_EQ(timer_skel->data->callback_check, 52, "callback_check1"); + ASSERT_EQ(timer_skel->data->callback2_check, 52, "callback2_check1"); + + prog_fd = bpf_program__fd(timer_skel->progs.test1); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(retval, 0, "test_run"); + timer__detach(timer_skel); + + usleep(50); /* 10 usecs should be enough, but give it extra */ + /* check that timer_cb1() was executed 10+10 times */ + ASSERT_EQ(timer_skel->data->callback_check, 42, "callback_check2"); + ASSERT_EQ(timer_skel->data->callback2_check, 42, "callback2_check2"); + + /* check that timer_cb2() was executed twice */ + ASSERT_EQ(timer_skel->bss->bss_data, 10, "bss_data"); + + /* check that there were no errors in timer execution */ + ASSERT_EQ(timer_skel->bss->err, 0, "err"); + + /* check that code paths completed */ + ASSERT_EQ(timer_skel->bss->ok, 1 | 2 | 4, "ok"); + + return 0; +} + +void test_timer(void) +{ + struct timer *timer_skel = NULL; + int err; + + timer_skel = timer__open_and_load(); + if (!ASSERT_OK_PTR(timer_skel, "timer_skel_load")) + goto cleanup; + + err = timer(timer_skel); + ASSERT_OK(err, "timer"); +cleanup: + timer__destroy(timer_skel); +} diff --git a/tools/testing/selftests/bpf/progs/timer.c b/tools/testing/selftests/bpf/progs/timer.c new file mode 100644 index 000000000000..5f5309791649 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/timer.c @@ -0,0 +1,297 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include +#include +#include +#include "bpf_tcp_helpers.h" + +char _license[] SEC("license") = "GPL"; +struct hmap_elem { + int counter; + struct bpf_timer timer; + struct bpf_spin_lock lock; /* unused */ +}; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1000); + __type(key, int); + __type(value, struct hmap_elem); +} hmap SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(map_flags, BPF_F_NO_PREALLOC); + __uint(max_entries, 1000); + __type(key, int); + __type(value, struct hmap_elem); +} hmap_malloc SEC(".maps"); + +struct elem { + struct bpf_timer t; +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 2); + __type(key, int); + __type(value, struct elem); +} array SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_LRU_HASH); + __uint(max_entries, 4); + __type(key, int); + __type(value, struct elem); +} lru SEC(".maps"); + +__u64 bss_data; +__u64 err; +__u64 ok; +__u64 callback_check = 52; +__u64 callback2_check = 52; + +#define ARRAY 1 +#define HTAB 2 +#define HTAB_MALLOC 3 +#define LRU 4 + +/* callback for array and lru timers */ +static int timer_cb1(void *map, int *key, struct bpf_timer *timer) +{ + /* increment bss variable twice. + * Once via array timer callback and once via lru timer callback + */ + bss_data += 5; + + /* *key == 0 - the callback was called for array timer. + * *key == 4 - the callback was called from lru timer. + */ + if (*key == ARRAY) { + struct bpf_timer *lru_timer; + int lru_key = LRU; + + /* rearm array timer to be called again in ~35 seconds */ + if (bpf_timer_start(timer, 1ull << 35, 0) != 0) + err |= 1; + + lru_timer = bpf_map_lookup_elem(&lru, &lru_key); + if (!lru_timer) + return 0; + bpf_timer_set_callback(lru_timer, timer_cb1); + if (bpf_timer_start(lru_timer, 0, 0) != 0) + err |= 2; + } else if (*key == LRU) { + int lru_key, i; + + for (i = LRU + 1; + i <= 100 /* for current LRU eviction algorithm this number + * should be larger than ~ lru->max_entries * 2 + */; + i++) { + struct elem init = {}; + + /* lru_key cannot be used as loop induction variable + * otherwise the loop will be unbounded. + */ + lru_key = i; + + /* add more elements into lru map to push out current + * element and force deletion of this timer + */ + bpf_map_update_elem(map, &lru_key, &init, 0); + /* look it up to bump it into active list */ + bpf_map_lookup_elem(map, &lru_key); + + /* keep adding until *key changes underneath, + * which means that key/timer memory was reused + */ + if (*key != LRU) + break; + } + + /* check that the timer was removed */ + if (bpf_timer_cancel(timer) != -EINVAL) + err |= 4; + ok |= 1; + } + return 0; +} + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(test1, int a) +{ + struct bpf_timer *arr_timer, *lru_timer; + struct elem init = {}; + int lru_key = LRU; + int array_key = ARRAY; + + arr_timer = bpf_map_lookup_elem(&array, &array_key); + if (!arr_timer) + return 0; + bpf_timer_init(arr_timer, &array, CLOCK_MONOTONIC); + + bpf_map_update_elem(&lru, &lru_key, &init, 0); + lru_timer = bpf_map_lookup_elem(&lru, &lru_key); + if (!lru_timer) + return 0; + bpf_timer_init(lru_timer, &lru, CLOCK_MONOTONIC); + + bpf_timer_set_callback(arr_timer, timer_cb1); + bpf_timer_start(arr_timer, 0 /* call timer_cb1 asap */, 0); + + /* init more timers to check that array destruction + * doesn't leak timer memory. + */ + array_key = 0; + arr_timer = bpf_map_lookup_elem(&array, &array_key); + if (!arr_timer) + return 0; + bpf_timer_init(arr_timer, &array, CLOCK_MONOTONIC); + return 0; +} + +/* callback for prealloc and non-prealloca hashtab timers */ +static int timer_cb2(void *map, int *key, struct hmap_elem *val) +{ + if (*key == HTAB) + callback_check--; + else + callback2_check--; + if (val->counter > 0 && --val->counter) { + /* re-arm the timer again to execute after 1 usec */ + bpf_timer_start(&val->timer, 1000, 0); + } else if (*key == HTAB) { + struct bpf_timer *arr_timer; + int array_key = ARRAY; + + /* cancel arr_timer otherwise bpf_fentry_test1 prog + * will stay alive forever. + */ + arr_timer = bpf_map_lookup_elem(&array, &array_key); + if (!arr_timer) + return 0; + if (bpf_timer_cancel(arr_timer) != 1) + /* bpf_timer_cancel should return 1 to indicate + * that arr_timer was active at this time + */ + err |= 8; + + /* try to cancel ourself. It shouldn't deadlock. */ + if (bpf_timer_cancel(&val->timer) != -EDEADLK) + err |= 16; + + /* delete this key and this timer anyway. + * It shouldn't deadlock either. + */ + bpf_map_delete_elem(map, key); + + /* in preallocated hashmap both 'key' and 'val' could have been + * reused to store another map element (like in LRU above), + * but in controlled test environment the below test works. + * It's not a use-after-free. The memory is owned by the map. + */ + if (bpf_timer_start(&val->timer, 1000, 0) != -EINVAL) + err |= 32; + ok |= 2; + } else { + if (*key != HTAB_MALLOC) + err |= 64; + + /* try to cancel ourself. It shouldn't deadlock. */ + if (bpf_timer_cancel(&val->timer) != -EDEADLK) + err |= 128; + + /* delete this key and this timer anyway. + * It shouldn't deadlock either. + */ + bpf_map_delete_elem(map, key); + + /* in non-preallocated hashmap both 'key' and 'val' are RCU + * protected and still valid though this element was deleted + * from the map. Arm this timer for ~35 seconds. When callback + * finishes the call_rcu will invoke: + * htab_elem_free_rcu + * check_and_free_timer + * bpf_timer_cancel_and_free + * to cancel this 35 second sleep and delete the timer for real. + */ + if (bpf_timer_start(&val->timer, 1ull << 35, 0) != 0) + err |= 256; + ok |= 4; + } + return 0; +} + +int bpf_timer_test(void) +{ + struct hmap_elem *val; + int key = HTAB, key_malloc = HTAB_MALLOC; + + val = bpf_map_lookup_elem(&hmap, &key); + if (val) { + if (bpf_timer_init(&val->timer, &hmap, CLOCK_BOOTTIME) != 0) + err |= 512; + bpf_timer_set_callback(&val->timer, timer_cb2); + bpf_timer_start(&val->timer, 1000, 0); + } + val = bpf_map_lookup_elem(&hmap_malloc, &key_malloc); + if (val) { + if (bpf_timer_init(&val->timer, &hmap_malloc, CLOCK_BOOTTIME) != 0) + err |= 1024; + bpf_timer_set_callback(&val->timer, timer_cb2); + bpf_timer_start(&val->timer, 1000, 0); + } + return 0; +} + +SEC("fentry/bpf_fentry_test2") +int BPF_PROG(test2, int a, int b) +{ + struct hmap_elem init = {}, *val; + int key = HTAB, key_malloc = HTAB_MALLOC; + + init.counter = 10; /* number of times to trigger timer_cb2 */ + bpf_map_update_elem(&hmap, &key, &init, 0); + val = bpf_map_lookup_elem(&hmap, &key); + if (val) + bpf_timer_init(&val->timer, &hmap, CLOCK_BOOTTIME); + /* update the same key to free the timer */ + bpf_map_update_elem(&hmap, &key, &init, 0); + + bpf_map_update_elem(&hmap_malloc, &key_malloc, &init, 0); + val = bpf_map_lookup_elem(&hmap_malloc, &key_malloc); + if (val) + bpf_timer_init(&val->timer, &hmap_malloc, CLOCK_BOOTTIME); + /* update the same key to free the timer */ + bpf_map_update_elem(&hmap_malloc, &key_malloc, &init, 0); + + /* init more timers to check that htab operations + * don't leak timer memory. + */ + key = 0; + bpf_map_update_elem(&hmap, &key, &init, 0); + val = bpf_map_lookup_elem(&hmap, &key); + if (val) + bpf_timer_init(&val->timer, &hmap, CLOCK_BOOTTIME); + bpf_map_delete_elem(&hmap, &key); + bpf_map_update_elem(&hmap, &key, &init, 0); + val = bpf_map_lookup_elem(&hmap, &key); + if (val) + bpf_timer_init(&val->timer, &hmap, CLOCK_BOOTTIME); + + /* and with non-prealloc htab */ + key_malloc = 0; + bpf_map_update_elem(&hmap_malloc, &key_malloc, &init, 0); + val = bpf_map_lookup_elem(&hmap_malloc, &key_malloc); + if (val) + bpf_timer_init(&val->timer, &hmap_malloc, CLOCK_BOOTTIME); + bpf_map_delete_elem(&hmap_malloc, &key_malloc); + bpf_map_update_elem(&hmap_malloc, &key_malloc, &init, 0); + val = bpf_map_lookup_elem(&hmap_malloc, &key_malloc); + if (val) + bpf_timer_init(&val->timer, &hmap_malloc, CLOCK_BOOTTIME); + + return bpf_timer_test(); +} From patchwork Thu Jul 8 01:18:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 472255 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA05BC07E9B for ; Thu, 8 Jul 2021 01:19:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 940BB61206 for ; Thu, 8 Jul 2021 01:19:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230427AbhGHBVl (ORCPT ); Wed, 7 Jul 2021 21:21:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44214 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230379AbhGHBVi (ORCPT ); Wed, 7 Jul 2021 21:21:38 -0400 Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42E23C06175F; Wed, 7 Jul 2021 18:18:56 -0700 (PDT) Received: by mail-pj1-x1030.google.com with SMTP id p14-20020a17090ad30eb02901731c776526so2538330pju.4; Wed, 07 Jul 2021 18:18:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=WCM0qUYaCgBYrhbOFp35Di1REgdmWKUNaHQD3eXE5yY=; b=JhzmZ1HTtgnrRj/jekqq8jXS8t1nmoVJN1uRXhX1uyLgO1nYujhohEq+MV290HR61j xLFLaoDWaH+kt4mivymIVGrFqq/esnnfiL3I57bZchfqpo1sfnBoGgv6M8h+2jikpjGR 0IubxruLepB5JRzkym52pdX8zhasGKgxDEhBQUbvoaSW14mPA1v3GSBOGDCBZSA5Y63/ fGvBYd38b7/XeLevUQhP6zR10GMTKbLl4dKZZpU1H/zk5gMkNkIqew6KDVEJaOnm/rRP ORpotVS00XT5Hx8pX9+mEUH7i8u3SubJMTnrYqcbeTMMSKPGXyaDokzR8paH2R2SJeiW bcXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=WCM0qUYaCgBYrhbOFp35Di1REgdmWKUNaHQD3eXE5yY=; b=pnbN7CMMHs/gvpSKbp6eivRG/7EPIY6yvgKaKd6eN6nxDSIHgwuYPhw9o1fJ6yXXJJ pdvGtmsA41vbBfBFZC0+JHUcBakGu+em1uAZG9WlcnSZV4NOlMty6z/44D80TGqfzeoy 9ipdX/cB6YFmEtjsxWV0rUsRR3jDdwHcWDxTJjSDrjiQxmJ6axdEbufVJ38SkOBrWwoG CX9kWAsYZUW8gJnboqR8zSH0ab7zb5AVVenKTzLgj1lZlivHKtIYnnJnyEGElwIzKF02 O1POPlXEJ7CK/MTEr86EJavSfavNGX+ZL9/bTKNwFVJIHawYKY7pMs1hX9LaM/XDZfRg LaLQ== X-Gm-Message-State: AOAM530o/05HDlEdhmj8TGz4vyR23vWw4ZUod9GhC4uumcXObtJyoB0l DHqJVFSLvRMrpWDtYdNeC3E= X-Google-Smtp-Source: ABdhPJz2fRwRc8+xbkhDN9nu3D+Ahud/FpDZpR7p9/aqYB3dWAD4bKUmKz1SixVVqK+0cZ6kUSkJKA== X-Received: by 2002:a17:902:edc6:b029:129:a9a8:67fa with SMTP id q6-20020a170902edc6b0290129a9a867famr9794346plk.38.1625707135777; Wed, 07 Jul 2021 18:18:55 -0700 (PDT) Received: from ast-mbp.thefacebook.com ([2620:10d:c090:400::5:9f4e]) by smtp.gmail.com with ESMTPSA id d20sm417450pfn.219.2021.07.07.18.18.54 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jul 2021 18:18:55 -0700 (PDT) From: Alexei Starovoitov To: davem@davemloft.net Cc: daniel@iogearbox.net, andrii@kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v5 bpf-next 11/11] selftests/bpf: Add a test with bpf_timer in inner map. Date: Wed, 7 Jul 2021 18:18:33 -0700 Message-Id: <20210708011833.67028-12-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20210708011833.67028-1-alexei.starovoitov@gmail.com> References: <20210708011833.67028-1-alexei.starovoitov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexei Starovoitov Check that map-in-map supports bpf timers. Check that indirect "recursion" of timer callbacks works: timer_cb1() { bpf_timer_set_callback(timer_cb2); } timer_cb2() { bpf_timer_set_callback(timer_cb1); } Check that bpf_map_release htab_free_prealloced_timers bpf_timer_cancel_and_free hrtimer_cancel works while timer cb is running. "while true; do ./test_progs -t timer_mim; done" is a great stress test. It caught missing timer cancel in htab->extra_elems. timer_mim_reject.c is a negative test that checks that timer<->map mismatch is prevented. Signed-off-by: Alexei Starovoitov --- .../selftests/bpf/prog_tests/timer_mim.c | 69 +++++++++++++++ tools/testing/selftests/bpf/progs/timer_mim.c | 88 +++++++++++++++++++ .../selftests/bpf/progs/timer_mim_reject.c | 74 ++++++++++++++++ 3 files changed, 231 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/timer_mim.c create mode 100644 tools/testing/selftests/bpf/progs/timer_mim.c create mode 100644 tools/testing/selftests/bpf/progs/timer_mim_reject.c diff --git a/tools/testing/selftests/bpf/prog_tests/timer_mim.c b/tools/testing/selftests/bpf/prog_tests/timer_mim.c new file mode 100644 index 000000000000..f5acbcbe33a4 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/timer_mim.c @@ -0,0 +1,69 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include "timer_mim.skel.h" +#include "timer_mim_reject.skel.h" + +static int timer_mim(struct timer_mim *timer_skel) +{ + __u32 duration = 0, retval; + __u64 cnt1, cnt2; + int err, prog_fd, key1 = 1; + + err = timer_mim__attach(timer_skel); + if (!ASSERT_OK(err, "timer_attach")) + return err; + + prog_fd = bpf_program__fd(timer_skel->progs.test1); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(retval, 0, "test_run"); + timer_mim__detach(timer_skel); + + /* check that timer_cb[12] are incrementing 'cnt' */ + cnt1 = READ_ONCE(timer_skel->bss->cnt); + usleep(200); /* 100 times more than interval */ + cnt2 = READ_ONCE(timer_skel->bss->cnt); + ASSERT_GT(cnt2, cnt1, "cnt"); + + ASSERT_EQ(timer_skel->bss->err, 0, "err"); + /* check that code paths completed */ + ASSERT_EQ(timer_skel->bss->ok, 1 | 2, "ok"); + + close(bpf_map__fd(timer_skel->maps.inner_htab)); + err = bpf_map_delete_elem(bpf_map__fd(timer_skel->maps.outer_arr), &key1); + ASSERT_EQ(err, 0, "delete inner map"); + + /* check that timer_cb[12] are no longer running */ + cnt1 = READ_ONCE(timer_skel->bss->cnt); + usleep(200); + cnt2 = READ_ONCE(timer_skel->bss->cnt); + ASSERT_EQ(cnt2, cnt1, "cnt"); + + return 0; +} + +void test_timer_mim(void) +{ + struct timer_mim_reject *timer_reject_skel = NULL; + libbpf_print_fn_t old_print_fn = NULL; + struct timer_mim *timer_skel = NULL; + int err; + + old_print_fn = libbpf_set_print(NULL); + timer_reject_skel = timer_mim_reject__open_and_load(); + libbpf_set_print(old_print_fn); + if (!ASSERT_ERR_PTR(timer_reject_skel, "timer_reject_skel_load")) + goto cleanup; + + timer_skel = timer_mim__open_and_load(); + if (!ASSERT_OK_PTR(timer_skel, "timer_skel_load")) + goto cleanup; + + err = timer_mim(timer_skel); + ASSERT_OK(err, "timer_mim"); +cleanup: + timer_mim__destroy(timer_skel); + timer_mim_reject__destroy(timer_reject_skel); +} diff --git a/tools/testing/selftests/bpf/progs/timer_mim.c b/tools/testing/selftests/bpf/progs/timer_mim.c new file mode 100644 index 000000000000..2fee7ab105ef --- /dev/null +++ b/tools/testing/selftests/bpf/progs/timer_mim.c @@ -0,0 +1,88 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include +#include +#include +#include "bpf_tcp_helpers.h" + +char _license[] SEC("license") = "GPL"; +struct hmap_elem { + int pad; /* unused */ + struct bpf_timer timer; +}; + +struct inner_map { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1024); + __type(key, int); + __type(value, struct hmap_elem); +} inner_htab SEC(".maps"); + +#define ARRAY_KEY 1 +#define HASH_KEY 1234 + +struct outer_arr { + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); + __uint(max_entries, 2); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __array(values, struct inner_map); +} outer_arr SEC(".maps") = { + .values = { [ARRAY_KEY] = &inner_htab }, +}; + +__u64 err; +__u64 ok; +__u64 cnt; + +static int timer_cb1(void *map, int *key, struct hmap_elem *val); + +static int timer_cb2(void *map, int *key, struct hmap_elem *val) +{ + cnt++; + bpf_timer_set_callback(&val->timer, timer_cb1); + if (bpf_timer_start(&val->timer, 1000, 0)) + err |= 1; + ok |= 1; + return 0; +} + +/* callback for inner hash map */ +static int timer_cb1(void *map, int *key, struct hmap_elem *val) +{ + cnt++; + bpf_timer_set_callback(&val->timer, timer_cb2); + if (bpf_timer_start(&val->timer, 1000, 0)) + err |= 2; + /* Do a lookup to make sure 'map' and 'key' pointers are correct */ + bpf_map_lookup_elem(map, key); + ok |= 2; + return 0; +} + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(test1, int a) +{ + struct hmap_elem init = {}; + struct bpf_map *inner_map; + struct hmap_elem *val; + int array_key = ARRAY_KEY; + int hash_key = HASH_KEY; + + inner_map = bpf_map_lookup_elem(&outer_arr, &array_key); + if (!inner_map) + return 0; + + bpf_map_update_elem(inner_map, &hash_key, &init, 0); + val = bpf_map_lookup_elem(inner_map, &hash_key); + if (!val) + return 0; + + bpf_timer_init(&val->timer, inner_map, CLOCK_MONOTONIC); + if (bpf_timer_set_callback(&val->timer, timer_cb1)) + err |= 4; + if (bpf_timer_start(&val->timer, 0, 0)) + err |= 8; + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/timer_mim_reject.c b/tools/testing/selftests/bpf/progs/timer_mim_reject.c new file mode 100644 index 000000000000..5d648e3d8a41 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/timer_mim_reject.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include +#include +#include +#include "bpf_tcp_helpers.h" + +char _license[] SEC("license") = "GPL"; +struct hmap_elem { + int pad; /* unused */ + struct bpf_timer timer; +}; + +struct inner_map { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1024); + __type(key, int); + __type(value, struct hmap_elem); +} inner_htab SEC(".maps"); + +#define ARRAY_KEY 1 +#define ARRAY_KEY2 2 +#define HASH_KEY 1234 + +struct outer_arr { + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); + __uint(max_entries, 2); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __array(values, struct inner_map); +} outer_arr SEC(".maps") = { + .values = { [ARRAY_KEY] = &inner_htab }, +}; + +__u64 err; +__u64 ok; +__u64 cnt; + +/* callback for inner hash map */ +static int timer_cb(void *map, int *key, struct hmap_elem *val) +{ + return 0; +} + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(test1, int a) +{ + struct hmap_elem init = {}; + struct bpf_map *inner_map, *inner_map2; + struct hmap_elem *val; + int array_key = ARRAY_KEY; + int array_key2 = ARRAY_KEY2; + int hash_key = HASH_KEY; + + inner_map = bpf_map_lookup_elem(&outer_arr, &array_key); + if (!inner_map) + return 0; + + inner_map2 = bpf_map_lookup_elem(&outer_arr, &array_key2); + if (!inner_map2) + return 0; + bpf_map_update_elem(inner_map, &hash_key, &init, 0); + val = bpf_map_lookup_elem(inner_map, &hash_key); + if (!val) + return 0; + + bpf_timer_init(&val->timer, inner_map2, CLOCK_MONOTONIC); + if (bpf_timer_set_callback(&val->timer, timer_cb)) + err |= 4; + if (bpf_timer_start(&val->timer, 0, 0)) + err |= 8; + return 0; +}