From patchwork Mon Mar 1 10:43:17 2021
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 390780
From: Björn Töpel
To: ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
    bpf@vger.kernel.org
Cc: Björn Töpel, magnus.karlsson@intel.com, jonathan.lemon@gmail.com,
    maximmi@nvidia.com, andrii@kernel.org
Subject: [PATCH bpf-next 1/2] xsk: update rings for load-acquire/store-release semantics
Date: Mon, 1 Mar 2021 11:43:17 +0100
Message-Id: <20210301104318.263262-2-bjorn.topel@gmail.com>
In-Reply-To: <20210301104318.263262-1-bjorn.topel@gmail.com>
References: <20210301104318.263262-1-bjorn.topel@gmail.com>
List-ID: <netdev.vger.kernel.org>

From: Björn Töpel

Currently, the AF_XDP rings use smp_{r,w,}mb() fences on the kernel
side. By updating the rings to load-acquire/store-release semantics,
the full barrier (smp_mb()) on the consumer side can be replaced, with
improved performance as a nice side-effect.

Note that this change does *not* require a corresponding change on the
libbpf/userland side; one is, however, recommended [1].

On x86-64 systems, removing the smp_mb() on the Rx and Tx side
increases the throughput of the l2fwd AF_XDP xdpsock sample by 1%.
Weakly ordered platforms, such as ARM64, might benefit even more.

[1] https://lore.kernel.org/bpf/20200316184423.GA14143@willie-the-truck/

Signed-off-by: Björn Töpel
---
 net/xdp/xsk_queue.h | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 2823b7c3302d..e24279d8d845 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -47,19 +47,18 @@ struct xsk_queue {
 	u64 queue_empty_descs;
 };
 
-/* The structure of the shared state of the rings are the same as the
- * ring buffer in kernel/events/ring_buffer.c. For the Rx and completion
- * ring, the kernel is the producer and user space is the consumer. For
- * the Tx and fill rings, the kernel is the consumer and user space is
- * the producer.
+/* The structure of the shared state of the rings is a simple
+ * circular buffer, as outlined in
+ * Documentation/core-api/circular-buffers.rst. For the Rx and
+ * completion ring, the kernel is the producer and user space is the
+ * consumer. For the Tx and fill rings, the kernel is the consumer and
+ * user space is the producer.
  *
  * producer                         consumer
  *
- * if (LOAD ->consumer) {           LOAD ->producer
- *                    (A)           smp_rmb()       (C)
+ * if (LOAD ->consumer) {  (A)      LOAD.acq ->producer  (C)
  *    STORE $data                   LOAD $data
- *    smp_wmb()       (B)           smp_mb()        (D)
- *    STORE ->producer              STORE ->consumer
+ *    STORE.rel ->producer  (B)     STORE.rel ->consumer  (D)
  * }
  *
  * (A) pairs with (D), and (B) pairs with (C).
@@ -227,15 +226,13 @@ static inline u32 xskq_cons_read_desc_batch(struct xsk_queue *q,
 
 static inline void __xskq_cons_release(struct xsk_queue *q)
 {
-	smp_mb(); /* D, matches A */
-	WRITE_ONCE(q->ring->consumer, q->cached_cons);
+	smp_store_release(&q->ring->consumer, q->cached_cons); /* D, matches A */
 }
 
 static inline void __xskq_cons_peek(struct xsk_queue *q)
 {
 	/* Refresh the local pointer */
-	q->cached_prod = READ_ONCE(q->ring->producer);
-	smp_rmb(); /* C, matches B */
+	q->cached_prod = smp_load_acquire(&q->ring->producer); /* C, matches B */
 }
 
 static inline void xskq_cons_get_entries(struct xsk_queue *q)
@@ -397,9 +394,7 @@ static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 
 static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
 {
-	smp_wmb(); /* B, matches C */
-
-	WRITE_ONCE(q->ring->producer, idx);
+	smp_store_release(&q->ring->producer, idx); /* B, matches C */
 }
 
 static inline void xskq_prod_submit(struct xsk_queue *q)
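To make the barrier pairing in the comment above concrete, here is a
minimal userspace sketch (not part of the series) of the same
single-producer/single-consumer protocol, using the GCC/Clang __atomic
builtins as stand-ins for smp_load_acquire()/smp_store_release(). The
struct ring layout, the 64-entry capacity, and the produce()/consume()
helpers are invented purely for illustration:

	#include <stdint.h>

	struct ring {
		uint32_t producer;
		uint32_t consumer;
		uint32_t data[64];
	};

	/* Producer side (the kernel, for the Rx/completion rings). */
	static void produce(struct ring *r, uint32_t v)
	{
		uint32_t prod = r->producer; /* we own ->producer; plain load */
		uint32_t cons;

		/* (A): a relaxed load of ->consumer suffices here, mirroring
		 * the kernel's READ_ONCE() of the consumer index.
		 */
		cons = __atomic_load_n(&r->consumer, __ATOMIC_RELAXED);
		if (prod - cons < 64) {
			r->data[prod & 63] = v;            /* STORE $data */
			/* (B), pairs with (C): publish the data before the index. */
			__atomic_store_n(&r->producer, prod + 1, __ATOMIC_RELEASE);
		}
	}

	/* Consumer side (user space, for the Rx/completion rings). */
	static int consume(struct ring *r, uint32_t *v)
	{
		uint32_t cons = r->consumer; /* we own ->consumer; plain load */
		uint32_t prod;

		/* (C), pairs with (B): see all data the index covers. */
		prod = __atomic_load_n(&r->producer, __ATOMIC_ACQUIRE);
		if (prod == cons)
			return 0;
		*v = r->data[cons & 63];                   /* LOAD $data */
		/* (D), pairs with (A): hand the slot back only after reading it. */
		__atomic_store_n(&r->consumer, cons + 1, __ATOMIC_RELEASE);
		return 1;
	}

The patch below makes the corresponding change on the
libbpf/user-space side.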
From patchwork Mon Mar 1 10:43:18 2021
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 389027
From: Björn Töpel
To: ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
    bpf@vger.kernel.org
Cc: Björn Töpel, magnus.karlsson@intel.com, jonathan.lemon@gmail.com,
    maximmi@nvidia.com, andrii@kernel.org
Subject: [PATCH bpf-next 2/2] libbpf, xsk: add libbpf_smp_store_release libbpf_smp_load_acquire
Date: Mon, 1 Mar 2021 11:43:18 +0100
Message-Id: <20210301104318.263262-3-bjorn.topel@gmail.com>
In-Reply-To: <20210301104318.263262-1-bjorn.topel@gmail.com>
References: <20210301104318.263262-1-bjorn.topel@gmail.com>
List-ID: <netdev.vger.kernel.org>

From: Björn Töpel

Now that the AF_XDP rings have load-acquire/store-release semantics,
move libbpf over to them as well. Note that the library-internal
libbpf_smp_{load_acquire,store_release} helpers are only valid for
32-bit words on ARM64. Also, remove the barriers that are no longer
in use.

Signed-off-by: Björn Töpel
---
 tools/lib/bpf/libbpf_util.h | 72 +++++++++++++++++++++++++------------
 tools/lib/bpf/xsk.h         | 17 +++------
 2 files changed, 55 insertions(+), 34 deletions(-)

diff --git a/tools/lib/bpf/libbpf_util.h b/tools/lib/bpf/libbpf_util.h
index 59c779c5790c..94a0d7bb6f3c 100644
--- a/tools/lib/bpf/libbpf_util.h
+++ b/tools/lib/bpf/libbpf_util.h
@@ -5,6 +5,7 @@
 #define __LIBBPF_LIBBPF_UTIL_H
 
 #include <stdbool.h>
+#include <linux/compiler.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -15,29 +16,56 @@ extern "C" {
  * application that uses libbpf.
  */
 #if defined(__i386__) || defined(__x86_64__)
-# define libbpf_smp_rmb() asm volatile("" : : : "memory")
-# define libbpf_smp_wmb() asm volatile("" : : : "memory")
-# define libbpf_smp_mb() \
-	asm volatile("lock; addl $0,-4(%%rsp)" : : : "memory", "cc")
-/* Hinders stores to be observed before older loads. */
-# define libbpf_smp_rwmb() asm volatile("" : : : "memory")
+# define libbpf_smp_store_release(p, v)					\
+	do {								\
+		asm volatile("" : : : "memory");			\
+		WRITE_ONCE(*p, v);					\
+	} while (0)
+# define libbpf_smp_load_acquire(p)					\
+	({								\
+		typeof(*p) ___p1 = READ_ONCE(*p);			\
+		asm volatile("" : : : "memory");			\
+		___p1;							\
+	})
 #elif defined(__aarch64__)
-# define libbpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
-# define libbpf_smp_rwmb() libbpf_smp_mb()
-#elif defined(__arm__)
-/* These are only valid for armv7 and above */
-# define libbpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
-# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
-# define libbpf_smp_rwmb() libbpf_smp_mb()
-#else
-/* Architecture missing native barrier functions. */
-# define libbpf_smp_rmb() __sync_synchronize()
-# define libbpf_smp_wmb() __sync_synchronize()
-# define libbpf_smp_mb() __sync_synchronize()
-# define libbpf_smp_rwmb() __sync_synchronize()
+# define libbpf_smp_store_release(p, v)					\
+	asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
+# define libbpf_smp_load_acquire(p)					\
+	({								\
+		typeof(*p) ___p1;					\
+		asm volatile ("ldar %w0, %1"				\
+			      : "=r" (___p1) : "Q" (*p) : "memory");	\
+		___p1;							\
+	})
+#elif defined(__riscv)
+# define libbpf_smp_store_release(p, v)					\
+	do {								\
+		asm volatile ("fence rw,w" : : : "memory");		\
+		WRITE_ONCE(*p, v);					\
+	} while (0)
+# define libbpf_smp_load_acquire(p)					\
+	({								\
+		typeof(*p) ___p1 = READ_ONCE(*p);			\
+		asm volatile ("fence r,rw" : : : "memory");		\
+		___p1;							\
+	})
+#endif
+
+#ifndef libbpf_smp_store_release
+#define libbpf_smp_store_release(p, v)					\
+	do {								\
+		__sync_synchronize();					\
+		WRITE_ONCE(*p, v);					\
+	} while (0)
+#endif
+
+#ifndef libbpf_smp_load_acquire
+#define libbpf_smp_load_acquire(p)					\
+	({								\
+		typeof(*p) ___p1 = READ_ONCE(*p);			\
+		__sync_synchronize();					\
+		___p1;							\
+	})
 #endif
 
 #ifdef __cplusplus
diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
index e9f121f5d129..a9fdea87b5cd 100644
--- a/tools/lib/bpf/xsk.h
+++ b/tools/lib/bpf/xsk.h
@@ -96,7 +96,8 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
 	 * this function. Without this optimization it would have been
 	 * free_entries = r->cached_prod - r->cached_cons + r->size.
 	 */
-	r->cached_cons = *r->consumer + r->size;
+	r->cached_cons = libbpf_smp_load_acquire(r->consumer);
+	r->cached_cons += r->size;
 
 	return r->cached_cons - r->cached_prod;
 }
@@ -106,7 +107,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
 	__u32 entries = r->cached_prod - r->cached_cons;
 
 	if (entries == 0) {
-		r->cached_prod = *r->producer;
+		r->cached_prod = libbpf_smp_load_acquire(r->producer);
 		entries = r->cached_prod - r->cached_cons;
 	}
 
@@ -129,9 +130,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
 	/* Make sure everything has been written to the ring before indicating
 	 * this to the kernel by writing the producer pointer.
 	 */
-	libbpf_smp_wmb();
-
-	*prod->producer += nb;
+	libbpf_smp_store_release(prod->producer, *prod->producer + nb);
 }
 
 static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
@@ -139,11 +138,6 @@ static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
 	__u32 entries = xsk_cons_nb_avail(cons, nb);
 
 	if (entries > 0) {
-		/* Make sure we do not speculatively read the data before
-		 * we have received the packet buffers from the ring.
-		 */
-		libbpf_smp_rmb();
-
 		*idx = cons->cached_cons;
 		cons->cached_cons += entries;
 	}
@@ -161,9 +155,8 @@ static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
 	/* Make sure data has been read before indicating we are done
 	 * with the entries by updating the consumer pointer.
 	 */
-	libbpf_smp_rwmb();
+	libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
 
-	*cons->consumer += nb;
 }
 
 static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
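To close, a hypothetical application-side snippet (not part of the
series) showing where the new acquire/release operations fire in
practice, by recycling completed UMEM frames from the completion ring
back onto the fill ring. recycle_completions() and its batch parameter
are made-up names and a single-threaded ring user is assumed; the
xsk_ring_*() helpers are the real libbpf ones patched above:

	#include <bpf/xsk.h>

	static void recycle_completions(struct xsk_ring_cons *comp,
					struct xsk_ring_prod *fill, __u32 batch)
	{
		__u32 idx_c = 0, idx_f = 0, n, i;

		/* Peek does a load-acquire of ->producer (C), so the
		 * descriptor reads below cannot be speculated past it.
		 */
		n = xsk_ring_cons__peek(comp, batch, &idx_c);
		if (!n)
			return;

		if (xsk_ring_prod__reserve(fill, n, &idx_f) != n) {
			/* Not enough free fill slots; undo the peek and
			 * let the caller retry later.
			 */
			xsk_ring_cons__cancel(comp, n);
			return;
		}

		for (i = 0; i < n; i++)
			*xsk_ring_prod__fill_addr(fill, idx_f + i) =
				*xsk_ring_cons__comp_addr(comp, idx_c + i);

		/* Store-release of ->producer (B): the addresses become
		 * visible to the kernel before the new producer index does.
		 */
		xsk_ring_prod__submit(fill, n);

		/* Store-release of ->consumer (D): the kernel reuses the
		 * completion slots only after we are done reading them.
		 */
		xsk_ring_cons__release(comp, n);
	}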