From patchwork Mon Oct 21 00:22:55 2019
X-Patchwork-Submitter: Honnappa Nagarahalli
X-Patchwork-Id: 177007
From: Honnappa Nagarahalli
To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com,
 bruce.richardson@intel.com, david.marchand@redhat.com,
 pbhagavatula@marvell.com, konstantin.ananyev@intel.com,
 drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com, honnappa.nagarahalli@arm.com
Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com, gavin.hu@arm.com
Date: Sun, 20 Oct 2019 19:22:55 -0500
Message-Id: <20191021002300.26497-2-honnappa.nagarahalli@arm.com>
In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
 <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
Subject: [dpdk-dev] [RFC v6 1/6] test/ring: use division for cycle count calculation
List-Id: DPDK patches and discussions
Use division instead of modulo operation to calculate more accurate
cycle count.

Signed-off-by: Honnappa Nagarahalli
Acked-by: Olivier Matz
---
 app/test/test_ring_perf.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index b6ad703bb..e3e17f251 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -284,10 +284,10 @@ test_single_enqueue_dequeue(struct rte_ring *r)
 	}
 	const uint64_t mc_end = rte_rdtsc();
 
-	printf("SP/SC single enq/dequeue: %"PRIu64"\n",
-			(sc_end-sc_start) >> iter_shift);
-	printf("MP/MC single enq/dequeue: %"PRIu64"\n",
-			(mc_end-mc_start) >> iter_shift);
+	printf("SP/SC single enq/dequeue: %.2F\n",
+			((double)(sc_end-sc_start)) / iterations);
+	printf("MP/MC single enq/dequeue: %.2F\n",
+			((double)(mc_end-mc_start)) / iterations);
 }
 
 /*
@@ -322,13 +322,15 @@ test_burst_enqueue_dequeue(struct rte_ring *r)
 	}
 	const uint64_t mc_end = rte_rdtsc();
 
-	uint64_t mc_avg = ((mc_end-mc_start) >> iter_shift) / bulk_sizes[sz];
-	uint64_t sc_avg = ((sc_end-sc_start) >> iter_shift) / bulk_sizes[sz];
+	double mc_avg = ((double)(mc_end-mc_start) / iterations) /
+			bulk_sizes[sz];
+	double sc_avg = ((double)(sc_end-sc_start) / iterations) /
+			bulk_sizes[sz];
 
-	printf("SP/SC burst enq/dequeue (size: %u): %"PRIu64"\n", bulk_sizes[sz],
-			sc_avg);
-	printf("MP/MC burst enq/dequeue (size: %u): %"PRIu64"\n", bulk_sizes[sz],
-			mc_avg);
+	printf("SP/SC burst enq/dequeue (size: %u): %.2F\n",
+			bulk_sizes[sz], sc_avg);
+	printf("MP/MC burst enq/dequeue (size: %u): %.2F\n",
+			bulk_sizes[sz], mc_avg);
 	}
 }
-- 
2.17.1

From patchwork Mon Oct 21 00:22:56 2019
X-Patchwork-Submitter: Honnappa Nagarahalli
X-Patchwork-Id: 177008
From: Honnappa Nagarahalli
To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com,
 bruce.richardson@intel.com, david.marchand@redhat.com,
 pbhagavatula@marvell.com, konstantin.ananyev@intel.com,
 drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com, honnappa.nagarahalli@arm.com
Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com, gavin.hu@arm.com
Date: Sun, 20 Oct 2019 19:22:56 -0500
Message-Id: <20191021002300.26497-3-honnappa.nagarahalli@arm.com>
In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
 <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
Subject: [dpdk-dev] [RFC v6 2/6] lib/ring: apis to support configurable element size
List-Id: DPDK patches and discussions
Current APIs assume ring elements to be pointers. However, in many
use cases, the size can be different. Add new APIs to support
configurable ring element sizes.

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Dharmik Thakkar
Reviewed-by: Gavin Hu
Reviewed-by: Ruifeng Wang
---
 lib/librte_ring/Makefile             |   3 +-
 lib/librte_ring/meson.build          |   4 +
 lib/librte_ring/rte_ring.c           |  44 +-
 lib/librte_ring/rte_ring.h           |   1 +
 lib/librte_ring/rte_ring_elem.h      | 946 +++++++++++++++++++++++++++
 lib/librte_ring/rte_ring_version.map |   2 +
 6 files changed, 991 insertions(+), 9 deletions(-)
 create mode 100644 lib/librte_ring/rte_ring_elem.h
-- 
2.17.1

diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
index 21a36770d..515a967bb 100644
--- a/lib/librte_ring/Makefile
+++ b/lib/librte_ring/Makefile
@@ -6,7 +6,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 # library name
 LIB = librte_ring.a
 
-CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -DALLOW_EXPERIMENTAL_API
 LDLIBS += -lrte_eal
 
 EXPORT_MAP := rte_ring_version.map
@@ -18,6 +18,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
 
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \
+					rte_ring_elem.h \
					rte_ring_generic.h \
					rte_ring_c11_mem.h

diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
index ab8b0b469..7ebaba919 100644
--- a/lib/librte_ring/meson.build
+++ b/lib/librte_ring/meson.build
@@ -4,5 +4,9 @@ version = 2
 sources = files('rte_ring.c')
 headers = files('rte_ring.h',
+		'rte_ring_elem.h',
		'rte_ring_c11_mem.h',
		'rte_ring_generic.h')
+
+# rte_ring_create_elem and rte_ring_get_memsize_elem are experimental
+allow_experimental_apis = true

diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index d9b308036..e95285259 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -33,6 +33,7 @@
 #include
 
 #include "rte_ring.h"
+#include "rte_ring_elem.h"
 
 TAILQ_HEAD(rte_ring_list, rte_tailq_entry);
 
@@ -46,23 +47,41 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
 
 /* return the size of memory occupied by a ring */
 ssize_t
-rte_ring_get_memsize(unsigned count)
+rte_ring_get_memsize_elem(unsigned count, unsigned esize)
 {
 	ssize_t sz;
 
+	/* Supported esize values are 4/8/16.
+	 * Others can be added on need basis.
+	 */
+	if (esize != 4 && esize != 8 && esize != 16) {
+		RTE_LOG(ERR, RING,
+			"Unsupported esize value. Supported values are 4, 8 and 16\n");
+
+		return -EINVAL;
+	}
+
 	/* count must be a power of 2 */
 	if ((!POWEROF2(count)) || (count > RTE_RING_SZ_MASK)) {
 		RTE_LOG(ERR, RING,
-			"Requested size is invalid, must be power of 2, and "
-			"do not exceed the size limit %u\n", RTE_RING_SZ_MASK);
+			"Requested number of elements is invalid, must be power of 2, and not exceed %u\n",
+			RTE_RING_SZ_MASK);
+
 		return -EINVAL;
 	}
 
-	sz = sizeof(struct rte_ring) + count * sizeof(void *);
+	sz = sizeof(struct rte_ring) + count * esize;
 	sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE);
 	return sz;
 }
 
+/* return the size of memory occupied by a ring */
+ssize_t
+rte_ring_get_memsize(unsigned count)
+{
+	return rte_ring_get_memsize_elem(count, sizeof(void *));
+}
+
 void
 rte_ring_reset(struct rte_ring *r)
 {
@@ -114,10 +133,10 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	return 0;
 }
 
-/* create the ring */
+/* create the ring for a given element size */
 struct rte_ring *
-rte_ring_create(const char *name, unsigned count, int socket_id,
-		unsigned flags)
+rte_ring_create_elem(const char *name, unsigned count, unsigned esize,
+		int socket_id, unsigned flags)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	struct rte_ring *r;
@@ -135,7 +154,7 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 	if (flags & RING_F_EXACT_SZ)
 		count = rte_align32pow2(count + 1);
 
-	ring_size = rte_ring_get_memsize(count);
+	ring_size = rte_ring_get_memsize_elem(count, esize);
 	if (ring_size < 0) {
 		rte_errno = ring_size;
 		return NULL;
@@ -182,6 +201,15 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 	return r;
 }
 
+/* create the ring */
+struct rte_ring *
+rte_ring_create(const char *name, unsigned count, int socket_id,
+		unsigned flags)
+{
+	return rte_ring_create_elem(name, count, sizeof(void *), socket_id,
+		flags);
+}
+
 /* free the ring */
 void
 rte_ring_free(struct rte_ring *r)
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 2a9f768a1..18fc5d845 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -216,6 +216,7 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
  */
 struct rte_ring *rte_ring_create(const char *name, unsigned count,
 				int socket_id, unsigned flags);
+
 /**
  * De-allocate all memory used by the ring.
  *
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
new file mode 100644
index 000000000..7e9914567
--- /dev/null
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -0,0 +1,946 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Copyright (c) 2019 Arm Limited
+ * Copyright (c) 2010-2017 Intel Corporation
+ * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * All rights reserved.
+ * Derived from FreeBSD's bufring.h
+ * Used as BSD-3 Licensed with permission from Kip Macy.
+ */
+
+#ifndef _RTE_RING_ELEM_H_
+#define _RTE_RING_ELEM_H_
+
+/**
+ * @file
+ * RTE Ring with flexible element size
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "rte_ring.h"
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Calculate the memory size needed for a ring with given element size
+ *
+ * This function returns the number of bytes needed for a ring, given
+ * the number of elements in it and the size of the element. This value
+ * is the sum of the size of the structure rte_ring and the size of the
+ * memory needed for storing the elements. The value is aligned to a cache
+ * line size.
+ *
+ * @param count
+ *   The number of elements in the ring (must be a power of 2).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported.
+ * @return
+ *   - The memory size needed for the ring on success.
+ *   - -EINVAL if count is not a power of 2.
+ */
+__rte_experimental
+ssize_t rte_ring_get_memsize_elem(unsigned int count, unsigned int esize);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Create a new ring named *name* that stores elements with given size.
+ *
+ * This function uses ``memzone_reserve()`` to allocate memory. Then it
+ * calls rte_ring_init() to initialize an empty ring.
+ *
+ * The new ring size is set to *count*, which must be a power of
+ * two. Water marking is disabled by default. The real usable ring size
+ * is *count-1* instead of *count* to differentiate a free ring from an
+ * empty ring.
+ *
+ * The ring is added in RTE_TAILQ_RING list.
+ *
+ * @param name
+ *   The name of the ring.
+ * @param count
+ *   The number of elements in the ring (must be a power of 2).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in case of
+ *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   An OR of the following:
+ *   - RING_F_SP_ENQ: If this flag is set, the default behavior when
+ *     using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *     is "single-producer". Otherwise, it is "multi-producers".
+ *   - RING_F_SC_DEQ: If this flag is set, the default behavior when
+ *     using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *     is "single-consumer". Otherwise, it is "multi-consumers".
+ * @return
+ *   On success, the pointer to the new allocated ring. NULL on error with
+ *   rte_errno set appropriately. Possible errno values include:
+ *   - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *   - E_RTE_SECONDARY - function was called from a secondary process instance
+ *   - EINVAL - count provided is not a power of 2
+ *   - ENOSPC - the maximum number of memzones has already been allocated
+ *   - EEXIST - a memzone with the same name already exists
+ *   - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+__rte_experimental
+struct rte_ring *rte_ring_create_elem(const char *name, unsigned int count,
+			unsigned int esize, int socket_id, unsigned int flags);
+
+/* the actual enqueue of pointers on the ring.
+ * Placed here since identical code needed in both
+ * single and multi producer enqueue functions.
+ */
+#define ENQUEUE_PTRS_ELEM(r, ring_start, prod_head, obj_table, esize, n) do { \
+	if (esize == 4) \
+		ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n); \
+	else if (esize == 8) \
+		ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n); \
+	else if (esize == 16) \
+		ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n); \
+} while (0)
+
+#define ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n) do { \
+	unsigned int i; \
+	const uint32_t size = (r)->size; \
+	uint32_t idx = prod_head & (r)->mask; \
+	uint32_t *ring = (uint32_t *)ring_start; \
+	uint32_t *obj = (uint32_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & ((~(uint32_t)0x7))); i += 8, idx += 8) { \
+			ring[idx] = obj[i]; \
+			ring[idx + 1] = obj[i + 1]; \
+			ring[idx + 2] = obj[i + 2]; \
+			ring[idx + 3] = obj[i + 3]; \
+			ring[idx + 4] = obj[i + 4]; \
+			ring[idx + 5] = obj[i + 5]; \
+			ring[idx + 6] = obj[i + 6]; \
+			ring[idx + 7] = obj[i + 7]; \
+		} \
+		switch (n & 0x7) { \
+		case 7: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 6: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 5: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 4: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 3: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 2: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 1: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++)\
+			ring[idx] = obj[i]; \
+		for (idx = 0; i < n; i++, idx++) \
+			ring[idx] = obj[i]; \
+	} \
+} while (0)
+
+#define ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n) do { \
+	unsigned int i; \
+	const uint32_t size = (r)->size; \
+	uint32_t idx = prod_head & (r)->mask; \
+	uint64_t *ring = (uint64_t *)ring_start; \
+	uint64_t *obj = (uint64_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & ((~(uint32_t)0x3))); i += 4, idx += 4) { \
+			ring[idx] = obj[i]; \
+			ring[idx + 1] = obj[i + 1]; \
+			ring[idx + 2] = obj[i + 2]; \
+			ring[idx + 3] = obj[i + 3]; \
+		} \
+		switch (n & 0x3) { \
+		case 3: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 2: \
+			ring[idx++] = obj[i++]; /* fallthrough */ \
+		case 1: \
+			ring[idx++] = obj[i++]; \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++)\
+			ring[idx] = obj[i]; \
+		for (idx = 0; i < n; i++, idx++) \
+			ring[idx] = obj[i]; \
+	} \
+} while (0)
+
+#define ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n) do { \
+	unsigned int i; \
+	const uint32_t size = (r)->size; \
+	uint32_t idx = prod_head & (r)->mask; \
+	__uint128_t *ring = (__uint128_t *)ring_start; \
+	__uint128_t *obj = (__uint128_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & ((~(uint32_t)0x1))); i += 2, idx += 2) { \
+			ring[idx] = obj[i]; \
+			ring[idx + 1] = obj[i + 1]; \
+		} \
+		switch (n & 0x1) { \
+		case 1: \
+			ring[idx++] = obj[i++]; \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++)\
+			ring[idx] = obj[i]; \
+		for (idx = 0; i < n; i++, idx++) \
+			ring[idx] = obj[i]; \
+	} \
+} while (0)
+
+/* the actual copy of pointers on the ring to obj_table.
+ * Placed here since identical code needed in both
+ * single and multi consumer dequeue functions.
+ */
+#define DEQUEUE_PTRS_ELEM(r, ring_start, cons_head, obj_table, esize, n) do { \
+	if (esize == 4) \
+		DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n); \
+	else if (esize == 8) \
+		DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n); \
+	else if (esize == 16) \
+		DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n); \
+} while (0)
+
+#define DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n) do { \
+	unsigned int i; \
+	uint32_t idx = cons_head & (r)->mask; \
+	const uint32_t size = (r)->size; \
+	uint32_t *ring = (uint32_t *)ring_start; \
+	uint32_t *obj = (uint32_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & (~(uint32_t)0x7)); i += 8, idx += 8) {\
+			obj[i] = ring[idx]; \
+			obj[i + 1] = ring[idx + 1]; \
+			obj[i + 2] = ring[idx + 2]; \
+			obj[i + 3] = ring[idx + 3]; \
+			obj[i + 4] = ring[idx + 4]; \
+			obj[i + 5] = ring[idx + 5]; \
+			obj[i + 6] = ring[idx + 6]; \
+			obj[i + 7] = ring[idx + 7]; \
+		} \
+		switch (n & 0x7) { \
+		case 7: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 6: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 5: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 4: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 3: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 2: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 1: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++) \
+			obj[i] = ring[idx]; \
+		for (idx = 0; i < n; i++, idx++) \
+			obj[i] = ring[idx]; \
+	} \
+} while (0)
+
+#define DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n) do { \
+	unsigned int i; \
+	uint32_t idx = cons_head & (r)->mask; \
+	const uint32_t size = (r)->size; \
+	uint64_t *ring = (uint64_t *)ring_start; \
+	uint64_t *obj = (uint64_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & (~(uint32_t)0x3)); i += 4, idx += 4) {\
+			obj[i] = ring[idx]; \
+			obj[i + 1] = ring[idx + 1]; \
+			obj[i + 2] = ring[idx + 2]; \
+			obj[i + 3] = ring[idx + 3]; \
+		} \
+		switch (n & 0x3) { \
+		case 3: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 2: \
+			obj[i++] = ring[idx++]; /* fallthrough */ \
+		case 1: \
+			obj[i++] = ring[idx++]; \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++) \
+			obj[i] = ring[idx]; \
+		for (idx = 0; i < n; i++, idx++) \
+			obj[i] = ring[idx]; \
+	} \
+} while (0)
+
+#define DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n) do { \
+	unsigned int i; \
+	uint32_t idx = cons_head & (r)->mask; \
+	const uint32_t size = (r)->size; \
+	__uint128_t *ring = (__uint128_t *)ring_start; \
+	__uint128_t *obj = (__uint128_t *)obj_table; \
+	if (likely(idx + n < size)) { \
+		for (i = 0; i < (n & (~(uint32_t)0x1)); i += 2, idx += 2) { \
+			obj[i] = ring[idx]; \
+			obj[i + 1] = ring[idx + 1]; \
+		} \
+		switch (n & 0x1) { \
+		case 1: \
+			obj[i++] = ring[idx++]; \
+		} \
+	} else { \
+		for (i = 0; idx < size; i++, idx++) \
+			obj[i] = ring[idx]; \
+		for (idx = 0; i < n; i++, idx++) \
+			obj[i] = ring[idx]; \
+	} \
+} while (0)
+
+/* Between load and load, there might be cpu reorder in weak model
+ * (powerpc/arm).
+ * There are 2 choices for the users
+ * 1.use rmb() memory barrier
+ * 2.use one-direction load_acquire/store_release barrier, defined by
+ * CONFIG_RTE_USE_C11_MEM_MODEL=y
+ * It depends on performance test results.
+ * By default, move common functions to rte_ring_generic.h
+ */
+#ifdef RTE_USE_C11_MEM_MODEL
+#include "rte_ring_c11_mem.h"
+#else
+#include "rte_ring_generic.h"
+#endif
+
+/**
+ * @internal Enqueue several objects on the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
+ * @param is_sp
+ *   Indicates whether to use single producer or multi-producer head update
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue_elem(struct rte_ring *r, void * const obj_table,
+		unsigned int esize, unsigned int n,
+		enum rte_ring_queue_behavior behavior, unsigned int is_sp,
+		unsigned int *free_space)
+{
+	uint32_t prod_head, prod_next;
+	uint32_t free_entries;
+
+	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
+			&prod_head, &prod_next, &free_entries);
+	if (n == 0)
+		goto end;
+
+	ENQUEUE_PTRS_ELEM(r, &r[1], prod_head, obj_table, esize, n);
+
+	update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
+end:
+	if (free_space != NULL)
+		*free_space = free_entries - n;
+	return n;
+}
+
+/**
+ * @internal Dequeue several objects from the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param is_sc
+ *   Indicates whether to use single consumer or multi-consumer head update
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue_elem(struct rte_ring *r, void *obj_table,
+		unsigned int esize, unsigned int n,
+		enum rte_ring_queue_behavior behavior, unsigned int is_sc,
+		unsigned int *available)
+{
+	uint32_t cons_head, cons_next;
+	uint32_t entries;
+
+	n = __rte_ring_move_cons_head(r, (int)is_sc, n, behavior,
+			&cons_head, &cons_next, &entries);
+	if (n == 0)
+		goto end;
+
+	DEQUEUE_PTRS_ELEM(r, &r[1], cons_head, obj_table, esize, n);
+
+	update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
+
+end:
+	if (available != NULL)
+		*available = entries - n;
+	return n;
+}
+
+/**
+ * Enqueue several objects on the ring (multi-producers safe).
+ *
+ * This function uses a "compare and set" instruction to move the
+ * producer index atomically.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_enqueue_bulk_elem(struct rte_ring *r, void * const obj_table,
+		unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, __IS_MP, free_space);
+}
+
+/**
+ * Enqueue several objects on a ring (NOT multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_sp_enqueue_bulk_elem(struct rte_ring *r, void * const obj_table,
+		unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, __IS_SP, free_space);
+}
+
+/**
+ * Enqueue several objects on a ring.
+ *
+ * This function calls the multi-producer or the single-producer
+ * version depending on the default behavior that was specified at
+ * ring creation time (see flags).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_enqueue_bulk_elem(struct rte_ring *r, void * const obj_table,
+		unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, r->prod.single, free_space);
+}
+
+/**
+ * Enqueue one object on a ring (multi-producers safe).
+ *
+ * This function uses a "compare and set" instruction to move the
+ * producer index atomically.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj
+ *   A pointer to the object to be added.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @return
+ *   - 0: Success; objects enqueued.
+ *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ */
+static __rte_always_inline int
+rte_ring_mp_enqueue_elem(struct rte_ring *r, void *obj, unsigned int esize)
+{
+	return rte_ring_mp_enqueue_bulk_elem(r, obj, esize, 1, NULL) ? 0 :
+			-ENOBUFS;
+}
+
+/**
+ * Enqueue one object on a ring (NOT multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj
+ *   A pointer to the object to be added.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   Currently, sizes 4, 8 and 16 are supported. This should be the same
+ *   as passed while creating the ring, otherwise the results are undefined.
+ * @return
+ *   - 0: Success; objects enqueued.
+ *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ */ +static __rte_always_inline int +rte_ring_sp_enqueue_elem(struct rte_ring *r, void *obj, unsigned int esize) +{ + return rte_ring_sp_enqueue_bulk_elem(r, obj, esize, 1, NULL) ? 0 : + -ENOBUFS; +} + +/** + * Enqueue one object on a ring. + * + * This function calls the multi-producer or the single-producer + * version, depending on the default behaviour that was specified at + * ring creation time (see flags). + * + * @param r + * A pointer to the ring structure. + * @param obj + * A pointer to the object to be added. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @return + * - 0: Success; objects enqueued. + * - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued. + */ +static __rte_always_inline int +rte_ring_enqueue_elem(struct rte_ring *r, void *obj, unsigned int esize) +{ + return rte_ring_enqueue_bulk_elem(r, obj, esize, 1, NULL) ? 0 : + -ENOBUFS; +} + +/** + * Dequeue several objects from a ring (multi-consumers safe). + * + * This function uses a "compare and set" instruction to move the + * consumer index atomically. + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to dequeue from the ring to the obj_table. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. 
+ * @return + * The number of objects dequeued, either 0 or n + */ +static __rte_always_inline unsigned int +rte_ring_mc_dequeue_bulk_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_FIXED, __IS_MC, available); +} + +/** + * Dequeue several objects from a ring (NOT multi-consumers safe). + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to dequeue from the ring to the obj_table, + * must be strictly positive. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. + * @return + * The number of objects dequeued, either 0 or n + */ +static __rte_always_inline unsigned int +rte_ring_sc_dequeue_bulk_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_FIXED, __IS_SC, available); +} + +/** + * Dequeue several objects from a ring. + * + * This function calls the multi-consumers or the single-consumer + * version, depending on the default behaviour that was specified at + * ring creation time (see flags). + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. 
+ * @param n + * The number of objects to dequeue from the ring to the obj_table. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. + * @return + * The number of objects dequeued, either 0 or n + */ +static __rte_always_inline unsigned int +rte_ring_dequeue_bulk_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_FIXED, r->cons.single, available); +} + +/** + * Dequeue one object from a ring (multi-consumers safe). + * + * This function uses a "compare and set" instruction to move the + * consumer index atomically. + * + * @param r + * A pointer to the ring structure. + * @param obj_p + * A pointer to a void * pointer (object) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @return + * - 0: Success; objects dequeued. + * - -ENOENT: Not enough entries in the ring to dequeue; no object is + * dequeued. + */ +static __rte_always_inline int +rte_ring_mc_dequeue_elem(struct rte_ring *r, void *obj_p, + unsigned int esize) +{ + return rte_ring_mc_dequeue_bulk_elem(r, obj_p, esize, 1, NULL) ? 0 : + -ENOENT; +} + +/** + * Dequeue one object from a ring (NOT multi-consumers safe). + * + * @param r + * A pointer to the ring structure. + * @param obj_p + * A pointer to a void * pointer (object) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @return + * - 0: Success; objects dequeued. 
+ * - -ENOENT: Not enough entries in the ring to dequeue, no object is + * dequeued. + */ +static __rte_always_inline int +rte_ring_sc_dequeue_elem(struct rte_ring *r, void *obj_p, + unsigned int esize) +{ + return rte_ring_sc_dequeue_bulk_elem(r, obj_p, esize, 1, NULL) ? 0 : + -ENOENT; +} + +/** + * Dequeue one object from a ring. + * + * This function calls the multi-consumers or the single-consumer + * version depending on the default behaviour that was specified at + * ring creation time (see flags). + * + * @param r + * A pointer to the ring structure. + * @param obj_p + * A pointer to a void * pointer (object) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @return + * - 0: Success, objects dequeued. + * - -ENOENT: Not enough entries in the ring to dequeue, no object is + * dequeued. + */ +static __rte_always_inline int +rte_ring_dequeue_elem(struct rte_ring *r, void *obj_p, unsigned int esize) +{ + return rte_ring_dequeue_bulk_elem(r, obj_p, esize, 1, NULL) ? 0 : + -ENOENT; +} + +/** + * Enqueue several objects on the ring (multi-producers safe). + * + * This function uses a "compare and set" instruction to move the + * producer index atomically. + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to add in the ring from the obj_table. + * @param free_space + * if non-NULL, returns the amount of space in the ring after the + * enqueue operation has finished. 
+ * @return + * - n: Actual number of objects enqueued. + */ +static __rte_always_inline unsigned +rte_ring_mp_enqueue_burst_elem(struct rte_ring *r, void * const obj_table, + unsigned int esize, unsigned int n, unsigned int *free_space) +{ + return __rte_ring_do_enqueue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, __IS_MP, free_space); +} + +/** + * Enqueue several objects on a ring (NOT multi-producers safe). + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to add in the ring from the obj_table. + * @param free_space + * if non-NULL, returns the amount of space in the ring after the + * enqueue operation has finished. + * @return + * - n: Actual number of objects enqueued. + */ +static __rte_always_inline unsigned +rte_ring_sp_enqueue_burst_elem(struct rte_ring *r, void * const obj_table, + unsigned int esize, unsigned int n, unsigned int *free_space) +{ + return __rte_ring_do_enqueue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, __IS_SP, free_space); +} + +/** + * Enqueue several objects on a ring. + * + * This function calls the multi-producer or the single-producer + * version depending on the default behavior that was specified at + * ring creation time (see flags). + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. 
+ * @param n + * The number of objects to add in the ring from the obj_table. + * @param free_space + * if non-NULL, returns the amount of space in the ring after the + * enqueue operation has finished. + * @return + * - n: Actual number of objects enqueued. + */ +static __rte_always_inline unsigned +rte_ring_enqueue_burst_elem(struct rte_ring *r, void * const obj_table, + unsigned int esize, unsigned int n, unsigned int *free_space) +{ + return __rte_ring_do_enqueue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, r->prod.single, free_space); +} + +/** + * Dequeue several objects from a ring (multi-consumers safe). When the + * requested number of objects exceeds the number available, only the + * available objects are dequeued. + * + * This function uses a "compare and set" instruction to move the + * consumer index atomically. + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to dequeue from the ring to the obj_table. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. 
+ * @return + * - n: Actual number of objects dequeued, 0 if ring is empty + */ +static __rte_always_inline unsigned +rte_ring_mc_dequeue_burst_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, __IS_MC, available); +} + +/** + * Dequeue several objects from a ring (NOT multi-consumers safe). When the + * requested number of objects exceeds the number available, only the + * available objects are dequeued. + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. It must be a multiple of 4. + * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to dequeue from the ring to the obj_table. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. + * @return + * - n: Actual number of objects dequeued, 0 if ring is empty + */ +static __rte_always_inline unsigned +rte_ring_sc_dequeue_burst_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, __IS_SC, available); +} + +/** + * Dequeue multiple objects from a ring up to a maximum number. + * + * This function calls the multi-consumers or the single-consumer + * version, depending on the default behaviour that was specified at + * ring creation time (see flags). + * + * @param r + * A pointer to the ring structure. + * @param obj_table + * A pointer to a table of void * pointers (objects) that will be filled. + * @param esize + * The size of ring element, in bytes. 
+ * Currently, sizes 4, 8 and 16 are supported. This should be the same + * as passed while creating the ring, otherwise the results are undefined. + * @param n + * The number of objects to dequeue from the ring to the obj_table. + * @param available + * If non-NULL, returns the number of remaining ring entries after the + * dequeue has finished. + * @return + * - Number of objects dequeued + */ +static __rte_always_inline unsigned +rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table, + unsigned int esize, unsigned int n, unsigned int *available) +{ + return __rte_ring_do_dequeue_elem(r, obj_table, esize, n, + RTE_RING_QUEUE_VARIABLE, + r->cons.single, available); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_RING_ELEM_H_ */ diff --git a/lib/librte_ring/rte_ring_version.map b/lib/librte_ring/rte_ring_version.map index 510c1386e..e410a7503 100644 --- a/lib/librte_ring/rte_ring_version.map +++ b/lib/librte_ring/rte_ring_version.map @@ -21,6 +21,8 @@ DPDK_2.2 { EXPERIMENTAL { global: + rte_ring_create_elem; + rte_ring_get_memsize_elem; rte_ring_reset; }; From patchwork Mon Oct 21 00:22:57 2019 X-Patchwork-Submitter: Honnappa Nagarahalli X-Patchwork-Id: 177009 Delivered-To: patch@linaro.org
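For context on the API above: the bulk `_elem` contract (all-or-nothing transfer of `esize`-byte elements copied by value, rather than fixed `void *` pointers) can be modeled in a few lines of standalone C. The sketch below is illustrative only — it is single-threaded, all `toy_*` names are invented, and it omits the real ring's lock-free head/tail synchronization and the 4/8/16-byte fast paths:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of an element-size-aware ring. Capacity is size - 1 slots,
 * size must be a power of two so (index & mask) gives cheap wrap-around. */
struct toy_ring {
	uint32_t size;   /* number of slots, power of two */
	uint32_t mask;   /* size - 1 */
	uint32_t esize;  /* element size in bytes */
	uint32_t head;   /* next slot to write */
	uint32_t tail;   /* next slot to read */
	uint8_t data[];  /* size * esize bytes of storage */
};

/* All-or-nothing bulk enqueue: returns n on success, 0 if not enough room. */
static uint32_t
toy_enqueue_bulk_elem(struct toy_ring *r, const void *obj_table, uint32_t n)
{
	uint32_t i, free_cnt = r->size - 1 - (r->head - r->tail);

	if (n > free_cnt)
		return 0;
	for (i = 0; i < n; i++)
		memcpy(&r->data[((r->head + i) & r->mask) * r->esize],
		       (const uint8_t *)obj_table + (size_t)i * r->esize,
		       r->esize);
	r->head += n;
	return n;
}

/* All-or-nothing bulk dequeue: returns n on success, 0 if too few entries. */
static uint32_t
toy_dequeue_bulk_elem(struct toy_ring *r, void *obj_table, uint32_t n)
{
	uint32_t i, avail = r->head - r->tail;

	if (n > avail)
		return 0;
	for (i = 0; i < n; i++)
		memcpy((uint8_t *)obj_table + (size_t)i * r->esize,
		       &r->data[((r->tail + i) & r->mask) * r->esize],
		       r->esize);
	r->tail += n;
	return n;
}
```

The real `rte_ring` additionally moves the producer/consumer indices with compare-and-set (for the `mp`/`mc` variants), which this model deliberately leaves out.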
From: Honnappa Nagarahalli To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com, bruce.richardson@intel.com, david.marchand@redhat.com, pbhagavatula@marvell.com, konstantin.ananyev@intel.com, drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com, honnappa.nagarahalli@arm.com Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com, gavin.hu@arm.com Date: Sun, 20 Oct 2019 19:22:57 -0500 Message-Id: <20191021002300.26497-4-honnappa.nagarahalli@arm.com> In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com> References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com> <20191021002300.26497-1-honnappa.nagarahalli@arm.com> Subject: [dpdk-dev] [RFC v6 3/6] test/ring: add functional tests for configurable element size ring List-Id: DPDK patches and discussions 
Add functional tests for rte_ring_xxx_elem APIs. At this point these are derived mainly from existing rte_ring_xxx test cases. Signed-off-by: Honnappa Nagarahalli --- app/test/Makefile | 1 + app/test/meson.build | 1 + app/test/test_ring_elem.c | 859 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 861 insertions(+) create mode 100644 app/test/test_ring_elem.c -- 2.17.1 diff --git a/app/test/Makefile b/app/test/Makefile index 26ba6fe2b..483865b4a 100644 --- a/app/test/Makefile +++ b/app/test/Makefile @@ -77,6 +77,7 @@ SRCS-y += test_external_mem.c SRCS-y += test_rand_perf.c SRCS-y += test_ring.c +SRCS-y += test_ring_elem.c SRCS-y += test_ring_perf.c SRCS-y += test_pmd_perf.c diff --git a/app/test/meson.build b/app/test/meson.build index ec40943bd..1ca25c00a 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -100,6 +100,7 @@ test_sources = files('commands.c', 'test_red.c', 'test_reorder.c', 'test_ring.c', + 'test_ring_elem.c', 'test_ring_perf.c', 'test_rwlock.c', 'test_sched.c', diff --git a/app/test/test_ring_elem.c b/app/test/test_ring_elem.c new file mode 100644 index 000000000..54ae35a71 --- /dev/null +++ b/app/test/test_ring_elem.c @@ -0,0 +1,859 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2014 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "test.h" + +/* + * Ring + * ==== + * + * #. 
Basic tests: done on one core: + * + * - Using single producer/single consumer functions: + * + * - Enqueue one object, two objects, MAX_BULK objects + * - Dequeue one object, two objects, MAX_BULK objects + * - Check that dequeued pointers are correct + * + * - Using multi producers/multi consumers functions: + * + * - Enqueue one object, two objects, MAX_BULK objects + * - Dequeue one object, two objects, MAX_BULK objects + * - Check that dequeued pointers are correct + * + * #. Performance tests. + * + * Tests done in test_ring_perf.c + */ + +#define RING_SIZE 4096 +#define MAX_BULK 32 + +static rte_atomic32_t synchro; + +#define TEST_RING_VERIFY(exp) \ + if (!(exp)) { \ + printf("error at %s:%d\tcondition " #exp " failed\n", \ + __func__, __LINE__); \ + rte_ring_dump(stdout, r); \ + return -1; \ + } + +#define TEST_RING_FULL_EMPTY_ITER 8 + +/* + * helper routine for test_ring_basic + */ +static int +test_ring_basic_full_empty(struct rte_ring *r, void * const src, void *dst) +{ + unsigned i, rand; + const unsigned rsz = RING_SIZE - 1; + + printf("Basic full/empty test\n"); + + for (i = 0; TEST_RING_FULL_EMPTY_ITER != i; i++) { + + /* random shift in the ring */ + rand = RTE_MAX(rte_rand() % RING_SIZE, 1UL); + printf("%s: iteration %u, random shift: %u;\n", + __func__, i, rand); + TEST_RING_VERIFY(rte_ring_enqueue_bulk_elem(r, src, 8, rand, + NULL) != 0); + TEST_RING_VERIFY(rte_ring_dequeue_bulk_elem(r, dst, 8, rand, + NULL) == rand); + + /* fill the ring */ + TEST_RING_VERIFY(rte_ring_enqueue_bulk_elem(r, src, 8, rsz, NULL) != 0); + TEST_RING_VERIFY(0 == rte_ring_free_count(r)); + TEST_RING_VERIFY(rsz == rte_ring_count(r)); + TEST_RING_VERIFY(rte_ring_full(r)); + TEST_RING_VERIFY(0 == rte_ring_empty(r)); + + /* empty the ring */ + TEST_RING_VERIFY(rte_ring_dequeue_bulk_elem(r, dst, 8, rsz, + NULL) == rsz); + TEST_RING_VERIFY(rsz == rte_ring_free_count(r)); + TEST_RING_VERIFY(0 == rte_ring_count(r)); + TEST_RING_VERIFY(0 == rte_ring_full(r)); + 
TEST_RING_VERIFY(rte_ring_empty(r)); + + /* check data */ + TEST_RING_VERIFY(0 == memcmp(src, dst, rsz)); + rte_ring_dump(stdout, r); + } + return 0; +} + +static int +test_ring_basic(struct rte_ring *r) +{ + void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL; + int ret; + unsigned i, num_elems; + + /* alloc dummy object pointers */ + src = malloc(RING_SIZE*2*sizeof(void *)); + if (src == NULL) + goto fail; + + for (i = 0; i < RING_SIZE*2 ; i++) { + src[i] = (void *)(unsigned long)i; + } + cur_src = src; + + /* alloc some room for copied objects */ + dst = malloc(RING_SIZE*2*sizeof(void *)); + if (dst == NULL) + goto fail; + + memset(dst, 0, RING_SIZE*2*sizeof(void *)); + cur_dst = dst; + + printf("enqueue 1 obj\n"); + ret = rte_ring_sp_enqueue_bulk_elem(r, cur_src, 8, 1, NULL); + cur_src += 1; + if (ret == 0) + goto fail; + + printf("enqueue 2 objs\n"); + ret = rte_ring_sp_enqueue_bulk_elem(r, cur_src, 8, 2, NULL); + cur_src += 2; + if (ret == 0) + goto fail; + + printf("enqueue MAX_BULK objs\n"); + ret = rte_ring_sp_enqueue_bulk_elem(r, cur_src, 8, MAX_BULK, NULL); + cur_src += MAX_BULK; + if (ret == 0) + goto fail; + + printf("dequeue 1 obj\n"); + ret = rte_ring_sc_dequeue_bulk_elem(r, cur_dst, 8, 1, NULL); + cur_dst += 1; + if (ret == 0) + goto fail; + + printf("dequeue 2 objs\n"); + ret = rte_ring_sc_dequeue_bulk_elem(r, cur_dst, 8, 2, NULL); + cur_dst += 2; + if (ret == 0) + goto fail; + + printf("dequeue MAX_BULK objs\n"); + ret = rte_ring_sc_dequeue_bulk_elem(r, cur_dst, 8, MAX_BULK, NULL); + cur_dst += MAX_BULK; + if (ret == 0) + goto fail; + + /* check data */ + if (memcmp(src, dst, cur_dst - dst)) { + rte_hexdump(stdout, "src", src, cur_src - src); + rte_hexdump(stdout, "dst", dst, cur_dst - dst); + printf("data after dequeue is not the same\n"); + goto fail; + } + cur_src = src; + cur_dst = dst; + + printf("enqueue 1 obj\n"); + ret = rte_ring_mp_enqueue_bulk_elem(r, cur_src, 8, 1, NULL); + cur_src += 1; + if (ret == 0) + goto fail; + + 
printf("enqueue 2 objs\n"); + ret = rte_ring_mp_enqueue_bulk_elem(r, cur_src, 8, 2, NULL); + cur_src += 2; + if (ret == 0) + goto fail; + + printf("enqueue MAX_BULK objs\n"); + ret = rte_ring_mp_enqueue_bulk_elem(r, cur_src, 8, MAX_BULK, NULL); + cur_src += MAX_BULK; + if (ret == 0) + goto fail; + + printf("dequeue 1 obj\n"); + ret = rte_ring_mc_dequeue_bulk_elem(r, cur_dst, 8, 1, NULL); + cur_dst += 1; + if (ret == 0) + goto fail; + + printf("dequeue 2 objs\n"); + ret = rte_ring_mc_dequeue_bulk_elem(r, cur_dst, 8, 2, NULL); + cur_dst += 2; + if (ret == 0) + goto fail; + + printf("dequeue MAX_BULK objs\n"); + ret = rte_ring_mc_dequeue_bulk_elem(r, cur_dst, 8, MAX_BULK, NULL); + cur_dst += MAX_BULK; + if (ret == 0) + goto fail; + + /* check data */ + if (memcmp(src, dst, cur_dst - dst)) { + rte_hexdump(stdout, "src", src, cur_src - src); + rte_hexdump(stdout, "dst", dst, cur_dst - dst); + printf("data after dequeue is not the same\n"); + goto fail; + } + cur_src = src; + cur_dst = dst; + + printf("fill and empty the ring\n"); + for (i = 0; i= rte_ring_get_size(exact_sz_ring)) { + printf("%s: error, std ring (size: %u) is not smaller than exact size one (size %u)\n", + __func__, + rte_ring_get_size(std_ring), + rte_ring_get_size(exact_sz_ring)); + goto end; + } + /* + * check that the exact_sz_ring can hold one more element than the + * standard ring. 
(16 vs 15 elements) + */ + for (i = 0; i < ring_sz - 1; i++) { + rte_ring_enqueue_elem(std_ring, ptr_array, 8); + rte_ring_enqueue_elem(exact_sz_ring, ptr_array, 8); + } + if (rte_ring_enqueue_elem(std_ring, ptr_array, 8) != -ENOBUFS) { + printf("%s: error, unexpected successful enqueue\n", __func__); + goto end; + } + if (rte_ring_enqueue_elem(exact_sz_ring, ptr_array, 8) == -ENOBUFS) { + printf("%s: error, enqueue failed\n", __func__); + goto end; + } + + /* check that dequeue returns the expected number of elements */ + if (rte_ring_dequeue_burst_elem(exact_sz_ring, ptr_array, 8, + RTE_DIM(ptr_array), NULL) != ring_sz) { + printf("%s: error, failed to dequeue expected nb of elements\n", + __func__); + goto end; + } + + /* check that the capacity function returns expected value */ + if (rte_ring_get_capacity(exact_sz_ring) != ring_sz) { + printf("%s: error, incorrect ring capacity reported\n", + __func__); + goto end; + } + + ret = 0; /* all ok if we get here */ +end: + rte_ring_free(std_ring); + rte_ring_free(exact_sz_ring); + return ret; +} + +static int +test_ring(void) +{ + struct rte_ring *r = NULL; + + /* some more basic operations */ + if (test_ring_basic_ex() < 0) + goto test_fail; + + rte_atomic32_init(&synchro); + + r = rte_ring_create_elem("test", RING_SIZE, 8, SOCKET_ID_ANY, 0); + if (r == NULL) + goto test_fail; + + /* retrieve the ring from its name */ + if (rte_ring_lookup("test") != r) { + printf("Cannot lookup ring from its name\n"); + goto test_fail; + } + + /* burst operations */ + if (test_ring_burst_basic(r) < 0) + goto test_fail; + + /* basic operations */ + if (test_ring_basic(r) < 0) + goto test_fail; + + /* check that creating a ring with an odd count is rejected */ + if (test_create_count_odd() < 0) { + printf("Test failed to detect odd count\n"); + goto test_fail; + } else + printf("Test detected odd count\n"); + + /* test of creating ring with wrong size */ + if (test_ring_creation_with_wrong_size() < 0) + goto test_fail; + + /* test of creating a ring with an already-used name */ + 
if (test_ring_creation_with_an_used_name() < 0) + goto test_fail; + + if (test_ring_with_exact_size() < 0) + goto test_fail; + + /* dump the ring status */ + rte_ring_list_dump(stdout); + + rte_ring_free(r); + + return 0; + +test_fail: + rte_ring_free(r); + + return -1; +} + +REGISTER_TEST_COMMAND(ring_elem_autotest, test_ring); From patchwork Mon Oct 21 00:22:58 2019 X-Patchwork-Submitter: Honnappa Nagarahalli X-Patchwork-Id: 177010 Delivered-To: patch@linaro.org 
From: Honnappa Nagarahalli To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com, bruce.richardson@intel.com, david.marchand@redhat.com, pbhagavatula@marvell.com, konstantin.ananyev@intel.com, drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com, honnappa.nagarahalli@arm.com Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com, gavin.hu@arm.com Date: Sun, 20 Oct 2019 19:22:58 -0500 Message-Id: <20191021002300.26497-5-honnappa.nagarahalli@arm.com> In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com> References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com> 
<20191021002300.26497-1-honnappa.nagarahalli@arm.com> Subject: [dpdk-dev] [RFC v6 4/6] test/ring: add perf tests for configurable element size ring List-Id: DPDK patches and discussions Add performance tests for rte_ring_xxx_elem APIs. At this point these are derived mainly from existing rte_ring_xxx test cases. Signed-off-by: Honnappa Nagarahalli --- app/test/Makefile | 1 + app/test/meson.build | 1 + app/test/test_ring_perf_elem.c | 419 +++++++++++++++++++++++++++++++++ 3 files changed, 421 insertions(+) create mode 100644 app/test/test_ring_perf_elem.c -- 2.17.1 diff --git a/app/test/Makefile b/app/test/Makefile index 483865b4a..6f168881c 100644 --- a/app/test/Makefile +++ b/app/test/Makefile @@ -79,6 +79,7 @@ SRCS-y += test_rand_perf.c SRCS-y += test_ring.c SRCS-y += test_ring_elem.c SRCS-y += test_ring_perf.c +SRCS-y += test_ring_perf_elem.c SRCS-y += test_pmd_perf.c ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y) diff --git a/app/test/meson.build b/app/test/meson.build index 1ca25c00a..634cbbf26 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -102,6 +102,7 @@ test_sources = files('commands.c', 'test_ring.c', 'test_ring_elem.c', 'test_ring_perf.c', + 'test_ring_perf_elem.c', 'test_rwlock.c', 'test_sched.c', 'test_service_cores.c', diff --git a/app/test/test_ring_perf_elem.c b/app/test/test_ring_perf_elem.c new file mode 100644 index 000000000..402b7877a --- /dev/null +++ b/app/test/test_ring_perf_elem.c @@ -0,0 +1,419 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2014 Intel Corporation + */ + + +#include +#include +#include +#include +#include +#include +#include + +#include "test.h" + +/* + * Ring + * ==== + * + * Measures performance of various operations using rdtsc + * * Empty ring dequeue + * * Enqueue/dequeue of bursts in 1 thread + * * 
Enqueue/dequeue of bursts in 2 threads
+ */
+
+#define RING_NAME "RING_PERF"
+#define RING_SIZE 4096
+#define MAX_BURST 64
+
+/*
+ * the sizes to enqueue and dequeue in testing
+ * (marked volatile so they won't be seen as compile-time constants)
+ */
+static const volatile unsigned bulk_sizes[] = { 8, 32 };
+
+struct lcore_pair {
+	unsigned c1, c2;
+};
+
+static volatile unsigned lcore_count;
+
+/**** Functions to analyse our core mask to get cores for different tests ***/
+
+static int
+get_two_hyperthreads(struct lcore_pair *lcp)
+{
+	unsigned id1, id2;
+	unsigned c1, c2, s1, s2;
+	RTE_LCORE_FOREACH(id1) {
+		/* inner loop just re-reads all id's. We could skip the
+		 * first few elements, but since number of cores is small
+		 * there is little point
+		 */
+		RTE_LCORE_FOREACH(id2) {
+			if (id1 == id2)
+				continue;
+
+			c1 = rte_lcore_to_cpu_id(id1);
+			c2 = rte_lcore_to_cpu_id(id2);
+			s1 = rte_lcore_to_socket_id(id1);
+			s2 = rte_lcore_to_socket_id(id2);
+			if ((c1 == c2) && (s1 == s2)) {
+				lcp->c1 = id1;
+				lcp->c2 = id2;
+				return 0;
+			}
+		}
+	}
+	return 1;
+}
+
+static int
+get_two_cores(struct lcore_pair *lcp)
+{
+	unsigned id1, id2;
+	unsigned c1, c2, s1, s2;
+	RTE_LCORE_FOREACH(id1) {
+		RTE_LCORE_FOREACH(id2) {
+			if (id1 == id2)
+				continue;
+
+			c1 = rte_lcore_to_cpu_id(id1);
+			c2 = rte_lcore_to_cpu_id(id2);
+			s1 = rte_lcore_to_socket_id(id1);
+			s2 = rte_lcore_to_socket_id(id2);
+			if ((c1 != c2) && (s1 == s2)) {
+				lcp->c1 = id1;
+				lcp->c2 = id2;
+				return 0;
+			}
+		}
+	}
+	return 1;
+}
+
+static int
+get_two_sockets(struct lcore_pair *lcp)
+{
+	unsigned id1, id2;
+	unsigned s1, s2;
+	RTE_LCORE_FOREACH(id1) {
+		RTE_LCORE_FOREACH(id2) {
+			if (id1 == id2)
+				continue;
+			s1 = rte_lcore_to_socket_id(id1);
+			s2 = rte_lcore_to_socket_id(id2);
+			if (s1 != s2) {
+				lcp->c1 = id1;
+				lcp->c2 = id2;
+				return 0;
+			}
+		}
+	}
+	return 1;
+}
+
+/* Get cycle counts for dequeuing from an empty ring.
Should be 2 or 3 cycles */
+static void
+test_empty_dequeue(struct rte_ring *r)
+{
+	const unsigned iter_shift = 26;
+	const unsigned iterations = 1<<iter_shift;

[Lines lost in mail transfer: the remainder of test_empty_dequeue(), the
struct thread_params declaration and the opening of enqueue_bulk(). The
enqueue_bulk() prologue below is reconstructed by analogy with its
dequeue_bulk() twin.]

+static int
+enqueue_bulk(void *p)
+{
+	const unsigned iter_shift = 23;
+	const unsigned iterations = 1<<iter_shift;
+	struct thread_params *params = p;
+	struct rte_ring *r = params->r;
+	const unsigned size = params->size;
+	unsigned i;
+	uint32_t burst[MAX_BURST] = {0};
+
+#ifdef RTE_USE_C11_MEM_MODEL
+	if (__atomic_add_fetch(&lcore_count, 1, __ATOMIC_RELAXED) != 2)
+#else
+	if (__sync_add_and_fetch(&lcore_count, 1) != 2)
+#endif
+		while (lcore_count != 2)
+			rte_pause();
+
+	const uint64_t sp_start = rte_rdtsc();
+	for (i = 0; i < iterations; i++)
+		while (rte_ring_sp_enqueue_bulk_elem(r, burst, 8, size, NULL)
+				== 0)
+			rte_pause();
+	const uint64_t sp_end = rte_rdtsc();
+
+	const uint64_t mp_start = rte_rdtsc();
+	for (i = 0; i < iterations; i++)
+		while (rte_ring_mp_enqueue_bulk_elem(r, burst, 8, size, NULL)
+				== 0)
+			rte_pause();
+	const uint64_t mp_end = rte_rdtsc();
+
+	params->spsc = ((double)(sp_end - sp_start))/(iterations*size);
+	params->mpmc = ((double)(mp_end - mp_start))/(iterations*size);
+	return 0;
+}
+
+/*
+ * Function that uses rdtsc to measure timing for ring dequeue.
Needs pair
+ * thread running enqueue_bulk function
+ */
+static int
+dequeue_bulk(void *p)
+{
+	const unsigned iter_shift = 23;
+	const unsigned iterations = 1<<iter_shift;
+	struct thread_params *params = p;
+	struct rte_ring *r = params->r;
+	const unsigned size = params->size;
+	unsigned i;
+	uint32_t burst[MAX_BURST] = {0};
+
+#ifdef RTE_USE_C11_MEM_MODEL
+	if (__atomic_add_fetch(&lcore_count, 1, __ATOMIC_RELAXED) != 2)
+#else
+	if (__sync_add_and_fetch(&lcore_count, 1) != 2)
+#endif
+		while (lcore_count != 2)
+			rte_pause();
+
+	const uint64_t sc_start = rte_rdtsc();
+	for (i = 0; i < iterations; i++)
+		while (rte_ring_sc_dequeue_bulk_elem(r, burst, 8, size, NULL)
+				== 0)
+			rte_pause();
+	const uint64_t sc_end = rte_rdtsc();
+
+	const uint64_t mc_start = rte_rdtsc();
+	for (i = 0; i < iterations; i++)
+		while (rte_ring_mc_dequeue_bulk_elem(r, burst, 8, size, NULL)
+				== 0)
+			rte_pause();
+	const uint64_t mc_end = rte_rdtsc();
+
+	params->spsc = ((double)(sc_end - sc_start))/(iterations*size);
+	params->mpmc = ((double)(mc_end - mc_start))/(iterations*size);
+	return 0;
+}
+
+/*
+ * Function that calls the enqueue and dequeue bulk functions on pairs of
+ * cores. Used to measure ring perf between hyperthreads, cores and sockets.
+ */
+static void
+run_on_core_pair(struct lcore_pair *cores, struct rte_ring *r,
+		lcore_function_t f1, lcore_function_t f2)
+{
+	struct thread_params param1 = {0}, param2 = {0};
+	unsigned i;
+	for (i = 0; i < sizeof(bulk_sizes)/sizeof(bulk_sizes[0]); i++) {
+		lcore_count = 0;
+		param1.size = param2.size = bulk_sizes[i];
+		param1.r = param2.r = r;
+		if (cores->c1 == rte_get_master_lcore()) {
+			rte_eal_remote_launch(f2, &param2, cores->c2);
+			f1(&param1);
+			rte_eal_wait_lcore(cores->c2);
+		} else {
+			rte_eal_remote_launch(f1, &param1, cores->c1);
+			rte_eal_remote_launch(f2, &param2, cores->c2);
+			rte_eal_wait_lcore(cores->c1);
+			rte_eal_wait_lcore(cores->c2);
+		}
+		printf("SP/SC bulk enq/dequeue (size: %u): %.2F\n",
+			bulk_sizes[i], param1.spsc + param2.spsc);
+		printf("MP/MC bulk enq/dequeue (size: %u): %.2F\n",
+			bulk_sizes[i], param1.mpmc + param2.mpmc);
+	}
+}
+
+/*
+ * Test function that determines how long an enqueue + dequeue of a single item
+ * takes on a single lcore. Result is for comparison with the bulk enq+deq.
+ */
+static void
+test_single_enqueue_dequeue(struct rte_ring *r)
+{
+	const unsigned iter_shift = 24;
+	const unsigned iterations = 1<<iter_shift;

[Lines lost in mail transfer: the remainder of test_ring_perf_elem.c and
the opening of the next message in the series.]

X-Patchwork-Id: 177011
Delivered-To: patch@linaro.org
Return-Path: Received: from dpdk.org (dpdk.org.
[92.243.14.124]) by mx.google.com; Sun, 20 Oct 2019 17:24:30 -0700 (PDT)
From: Honnappa Nagarahalli
To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com,
	bruce.richardson@intel.com, david.marchand@redhat.com,
	pbhagavatula@marvell.com, konstantin.ananyev@intel.com,
	drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com,
	honnappa.nagarahalli@arm.com
Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com,
	gavin.hu@arm.com
Date: Sun, 20 Oct 2019 19:22:59 -0500
Message-Id: <20191021002300.26497-6-honnappa.nagarahalli@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
	<20191021002300.26497-1-honnappa.nagarahalli@arm.com>
Subject: [dpdk-dev] [RFC v6 5/6] lib/ring: copy ring elements using memcpy
	partially
Copy of ring elements now uses memcpy for 32B chunks; the remaining
bytes are copied using assignments.

Signed-off-by: Honnappa Nagarahalli
---
 lib/librte_ring/rte_ring.c      |  10 --
 lib/librte_ring/rte_ring_elem.h | 229 +++++++-------------------------
 2 files changed, 49 insertions(+), 190 deletions(-)

-- 
2.17.1

diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index e95285259..0f7f4b598 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -51,16 +51,6 @@ rte_ring_get_memsize_elem(unsigned count, unsigned esize)
 {
 	ssize_t sz;
 
-	/* Supported esize values are 4/8/16.
-	 * Others can be added on need basis.
-	 */
-	if (esize != 4 && esize != 8 && esize != 16) {
-		RTE_LOG(ERR, RING,
-			"Unsupported esize value. Supported values are 4, 8 and 16\n");
-
-		return -EINVAL;
-	}
-
 	/* count must be a power of 2 */
 	if ((!POWEROF2(count)) || (count > RTE_RING_SZ_MASK)) {
 		RTE_LOG(ERR, RING,
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 7e9914567..0ce5f2be7 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -24,6 +24,7 @@ extern "C" {
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -108,215 +109,83 @@ __rte_experimental
 struct rte_ring *rte_ring_create_elem(const char *name, unsigned int count,
 			unsigned int esize, int socket_id, unsigned int flags);
 
-/* the actual enqueue of pointers on the ring.
- * Placed here since identical code needed in both
- * single and multi producer enqueue functions.
- */
-#define ENQUEUE_PTRS_ELEM(r, ring_start, prod_head, obj_table, esize, n) do { \
-	if (esize == 4) \
-		ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n); \
-	else if (esize == 8) \
-		ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n); \
-	else if (esize == 16) \
-		ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n); \
-} while (0)
-
-#define ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n) do { \
-	unsigned int i; \
+#define ENQUEUE_PTRS_GEN(r, ring_start, prod_head, obj_table, esize, n) do { \
+	unsigned int i, j; \
 	const uint32_t size = (r)->size; \
 	uint32_t idx = prod_head & (r)->mask; \
 	uint32_t *ring = (uint32_t *)ring_start; \
 	uint32_t *obj = (uint32_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n & ((~(uint32_t)0x7))); i += 8, idx += 8) { \
-			ring[idx] = obj[i]; \
-			ring[idx + 1] = obj[i + 1]; \
-			ring[idx + 2] = obj[i + 2]; \
-			ring[idx + 3] = obj[i + 3]; \
-			ring[idx + 4] = obj[i + 4]; \
-			ring[idx + 5] = obj[i + 5]; \
-			ring[idx + 6] = obj[i + 6]; \
-			ring[idx + 7] = obj[i + 7]; \
+	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
+	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
+	uint32_t seg0 = size - idx; \
+	if (likely(n < seg0)) { \
+		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
+				i += 8, nr_idx += 8) { \
+			memcpy(ring + nr_idx, obj + i, 8 * sizeof(uint32_t)); \
 		} \
-		switch (n & 0x7) { \
+		switch (nr_n & 0x7) { \
 		case 7: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 6: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 5: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 4: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 3: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 2: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
-		case 1: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
-		} \
-	} else { \
-		for (i = 0; idx < size; i++, idx++)\
-			ring[idx] = obj[i]; \
-		for (idx = 0; i < n; i++, idx++) \
-			ring[idx] = obj[i]; \
-	} \
-} while (0)
-
-#define ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n) do { \
-	unsigned int i; \
-	const uint32_t size = (r)->size; \
-	uint32_t idx = prod_head & (r)->mask; \
-	uint64_t *ring = (uint64_t *)ring_start; \
-	uint64_t *obj = (uint64_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n & ((~(uint32_t)0x3))); i += 4, idx += 4) { \
-			ring[idx] = obj[i]; \
-			ring[idx + 1] = obj[i + 1]; \
-			ring[idx + 2] = obj[i + 2]; \
-			ring[idx + 3] = obj[i + 3]; \
-		} \
-		switch (n & 0x3) { \
-		case 3: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
-		case 2: \
-			ring[idx++] = obj[i++]; /* fallthrough */ \
-		case 1: \
-			ring[idx++] = obj[i++]; \
-		} \
-	} else { \
-		for (i = 0; idx < size; i++, idx++)\
-			ring[idx] = obj[i]; \
-		for (idx = 0; i < n; i++, idx++) \
-			ring[idx] = obj[i]; \
-	} \
-} while (0)
-
-#define ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n) do { \
-	unsigned int i; \
-	const uint32_t size = (r)->size; \
-	uint32_t idx = prod_head & (r)->mask; \
-	__uint128_t *ring = (__uint128_t *)ring_start; \
-	__uint128_t *obj = (__uint128_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n >> 1); i += 2, idx += 2) { \
-			ring[idx] = obj[i]; \
-			ring[idx + 1] = obj[i + 1]; \
-		} \
-		switch (n & 0x1) { \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		case 1: \
-			ring[idx++] = obj[i++]; \
+			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
 		} \
 	} else { \
-		for (i = 0; idx < size; i++, idx++)\
-			ring[idx] = obj[i]; \
-		for (idx = 0; i < n; i++, idx++) \
-			ring[idx] = obj[i]; \
+		uint32_t nr_seg0 = seg0 * (esize / sizeof(uint32_t)); \
+		uint32_t nr_seg1 = nr_n - nr_seg0; \
+		for (i = 0; i < nr_seg0; i++, nr_idx++)\
+			ring[nr_idx] = obj[i]; \
+		for (j = 0; j < nr_seg1; i++, j++) \
+			ring[j] = obj[i]; \
 	} \
 } while (0)
 
-/* the actual copy of pointers on the ring to obj_table.
- * Placed here since identical code needed in both
- * single and multi consumer dequeue functions.
- */
-#define DEQUEUE_PTRS_ELEM(r, ring_start, cons_head, obj_table, esize, n) do { \
-	if (esize == 4) \
-		DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n); \
-	else if (esize == 8) \
-		DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n); \
-	else if (esize == 16) \
-		DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n); \
-} while (0)
-
-#define DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n) do { \
-	unsigned int i; \
+#define DEQUEUE_PTRS_GEN(r, ring_start, cons_head, obj_table, esize, n) do { \
+	unsigned int i, j; \
 	uint32_t idx = cons_head & (r)->mask; \
 	const uint32_t size = (r)->size; \
 	uint32_t *ring = (uint32_t *)ring_start; \
 	uint32_t *obj = (uint32_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n & (~(uint32_t)0x7)); i += 8, idx += 8) {\
-			obj[i] = ring[idx]; \
-			obj[i + 1] = ring[idx + 1]; \
-			obj[i + 2] = ring[idx + 2]; \
-			obj[i + 3] = ring[idx + 3]; \
-			obj[i + 4] = ring[idx + 4]; \
-			obj[i + 5] = ring[idx + 5]; \
-			obj[i + 6] = ring[idx + 6]; \
-			obj[i + 7] = ring[idx + 7]; \
+	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
+	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
+	uint32_t seg0 = size - idx; \
+	if (likely(n < seg0)) { \
+		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
+				i += 8, nr_idx += 8) { \
+			memcpy(obj + i, ring + nr_idx, 8 * sizeof(uint32_t)); \
 		} \
-		switch (n & 0x7) { \
+		switch (nr_n & 0x7) { \
 		case 7: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 6: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 5: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 4: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 3: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 2: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
-		case 1: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
-		} \
-	} else { \
-		for (i = 0; idx < size; i++, idx++) \
-			obj[i] = ring[idx]; \
-		for (idx = 0; i < n; i++, idx++) \
-			obj[i] = ring[idx]; \
-	} \
-} while (0)
-
-#define DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n) do { \
-	unsigned int i; \
-	uint32_t idx = cons_head & (r)->mask; \
-	const uint32_t size = (r)->size; \
-	uint64_t *ring = (uint64_t *)ring_start; \
-	uint64_t *obj = (uint64_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n & (~(uint32_t)0x3)); i += 4, idx += 4) {\
-			obj[i] = ring[idx]; \
-			obj[i + 1] = ring[idx + 1]; \
-			obj[i + 2] = ring[idx + 2]; \
-			obj[i + 3] = ring[idx + 3]; \
-		} \
-		switch (n & 0x3) { \
-		case 3: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
-		case 2: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
-		case 1: \
-			obj[i++] = ring[idx++]; \
-		} \
-	} else { \
-		for (i = 0; idx < size; i++, idx++) \
-			obj[i] = ring[idx]; \
-		for (idx = 0; i < n; i++, idx++) \
-			obj[i] = ring[idx]; \
-	} \
-} while (0)
-
-#define DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n) do { \
-	unsigned int i; \
-	uint32_t idx = cons_head & (r)->mask; \
-	const uint32_t size = (r)->size; \
-	__uint128_t *ring = (__uint128_t *)ring_start; \
-	__uint128_t *obj = (__uint128_t *)obj_table; \
-	if (likely(idx + n < size)) { \
-		for (i = 0; i < (n >> 1); i += 2, idx += 2) { \
-			obj[i] = ring[idx]; \
-			obj[i + 1] = ring[idx + 1]; \
-		} \
-		switch (n & 0x1) { \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		case 1: \
-			obj[i++] = ring[idx++]; /* fallthrough */ \
+			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
 		} \
 	} else { \
-		for (i = 0; idx < size; i++, idx++) \
-			obj[i] = ring[idx]; \
-		for (idx = 0; i < n; i++, idx++) \
-			obj[i] = ring[idx]; \
+		uint32_t nr_seg0 = seg0 * (esize /
sizeof(uint32_t)); \
+		uint32_t nr_seg1 = nr_n - nr_seg0; \
+		for (i = 0; i < nr_seg0; i++, nr_idx++)\
+			obj[i] = ring[nr_idx];\
+		for (j = 0; j < nr_seg1; i++, j++) \
+			obj[i] = ring[j]; \
 	} \
 } while (0)
 
@@ -373,7 +242,7 @@ __rte_ring_do_enqueue_elem(struct rte_ring *r, void * const obj_table,
 	if (n == 0)
 		goto end;
 
-	ENQUEUE_PTRS_ELEM(r, &r[1], prod_head, obj_table, esize, n);
+	ENQUEUE_PTRS_GEN(r, &r[1], prod_head, obj_table, esize, n);
 
 	update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
 end:
@@ -420,7 +289,7 @@ __rte_ring_do_dequeue_elem(struct rte_ring *r, void *obj_table,
 	if (n == 0)
 		goto end;
 
-	DEQUEUE_PTRS_ELEM(r, &r[1], cons_head, obj_table, esize, n);
+	DEQUEUE_PTRS_GEN(r, &r[1], cons_head, obj_table, esize, n);
 
 	update_tail(&r->cons, cons_head, cons_next, is_sc, 0);

From patchwork Mon Oct 21 00:23:00 2019
X-Patchwork-Submitter: Honnappa Nagarahalli
X-Patchwork-Id: 177012
Delivered-To: patch@linaro.org
Return-Path: Received: from dpdk.org (dpdk.org. [92.243.14.124]) by
	mx.google.com; Sun, 20 Oct 2019 17:24:38 -0700 (PDT)
From: Honnappa Nagarahalli
To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com,
	bruce.richardson@intel.com, david.marchand@redhat.com,
	pbhagavatula@marvell.com, konstantin.ananyev@intel.com,
	drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com,
	honnappa.nagarahalli@arm.com
Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com,
	gavin.hu@arm.com
Date: Sun, 20 Oct 2019 19:23:00 -0500
Message-Id: <20191021002300.26497-7-honnappa.nagarahalli@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
	<20191021002300.26497-1-honnappa.nagarahalli@arm.com>
Subject: [dpdk-dev] [RFC v6 6/6] lib/ring: improved copy function to copy
	ring elements

Improved copy function to copy to/from ring elements.

Signed-off-by: Honnappa Nagarahalli
Signed-off-by: Konstantin Ananyev
---
 lib/librte_ring/rte_ring_elem.h | 165 ++++++++++++++++----------------
 1 file changed, 84 insertions(+), 81 deletions(-)

-- 
2.17.1

diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 0ce5f2be7..80ec3c562 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -109,85 +109,88 @@ __rte_experimental
 struct rte_ring *rte_ring_create_elem(const char *name, unsigned int count,
 			unsigned int esize, int socket_id, unsigned int flags);
 
-#define ENQUEUE_PTRS_GEN(r, ring_start, prod_head, obj_table, esize, n) do { \
-	unsigned int i, j; \
-	const uint32_t size = (r)->size; \
-	uint32_t idx = prod_head & (r)->mask; \
-	uint32_t *ring = (uint32_t *)ring_start; \
-	uint32_t *obj = (uint32_t *)obj_table; \
-	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
-	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
-	uint32_t seg0 = size - idx; \
-	if (likely(n < seg0)) { \
-		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
-				i += 8, nr_idx += 8) { \
-			memcpy(ring + nr_idx, obj + i, 8 * sizeof(uint32_t)); \
-		} \
-		switch (nr_n & 0x7) { \
-		case 7: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 6: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 5: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 4: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 3: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 2: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 1: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		} \
-	} else { \
-		uint32_t nr_seg0 = seg0 * (esize / sizeof(uint32_t)); \
-		uint32_t nr_seg1 = nr_n - nr_seg0; \
-		for (i = 0; i < nr_seg0; i++, nr_idx++)\
-			ring[nr_idx] = obj[i]; \
-		for (j = 0; j < nr_seg1; i++, j++) \
-			ring[j] = obj[i]; \
-	} \
-} while (0)
-
-#define DEQUEUE_PTRS_GEN(r, ring_start, cons_head, obj_table, esize, n) do { \
-	unsigned int i, j; \
-	uint32_t idx = cons_head & (r)->mask; \
-	const uint32_t size = (r)->size; \
-	uint32_t *ring = (uint32_t *)ring_start; \
-	uint32_t *obj = (uint32_t *)obj_table; \
-	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
-	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
-	uint32_t seg0 = size - idx; \
-	if (likely(n < seg0)) { \
-		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
-				i += 8, nr_idx += 8) { \
-			memcpy(obj + i, ring + nr_idx, 8 * sizeof(uint32_t)); \
-		} \
-		switch (nr_n & 0x7) { \
-		case 7: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 6: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 5: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 4: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 3: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 2: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 1: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		} \
-	} else { \
-		uint32_t nr_seg0 = seg0 * (esize / sizeof(uint32_t)); \
-		uint32_t nr_seg1 = nr_n - nr_seg0; \
-		for (i = 0; i < nr_seg0; i++, nr_idx++)\
-			obj[i] = ring[nr_idx];\
-		for (j = 0; j < nr_seg1; i++, j++) \
-			obj[i] = ring[j]; \
-	} \
-} while (0)
+static __rte_always_inline void
+copy_elems(uint32_t du32[], const uint32_t su32[], uint32_t nr_num)
+{
+	uint32_t i;
+
+	for (i = 0; i < (nr_num & ~7); i += 8)
+		memcpy(du32 + i, su32 + i, 8 * sizeof(uint32_t));
+
+	switch (nr_num & 7) {
+	case 7: du32[nr_num - 7] = su32[nr_num - 7]; /* fallthrough */
+	case 6: du32[nr_num - 6] = su32[nr_num - 6]; /* fallthrough */
+	case 5: du32[nr_num - 5] = su32[nr_num - 5]; /* fallthrough */
+	case 4: du32[nr_num - 4] = su32[nr_num - 4]; /* fallthrough */
+	case 3: du32[nr_num - 3] = su32[nr_num - 3]; /* fallthrough */
+	case 2: du32[nr_num - 2] = su32[nr_num - 2]; /* fallthrough */
+	case 1: du32[nr_num - 1] = su32[nr_num - 1]; /* fallthrough */
+	}
+}
+
+static __rte_always_inline void
+enqueue_elems(struct rte_ring *r, void *ring_start, uint32_t prod_head,
+		void *obj_table, uint32_t num, uint32_t esize)
+{
+	uint32_t idx, nr_idx, nr_num;
+	uint32_t *du32;
+	const uint32_t *su32;
+
+	const uint32_t size = r->size;
+	uint32_t s0, nr_s0, nr_s1;
+
+	idx = prod_head & (r)->mask;
+	/* Normalize the idx to uint32_t */
+	nr_idx = (idx * esize) / sizeof(uint32_t);
+
+	du32 = (uint32_t *)ring_start + nr_idx;
+	su32 = obj_table;
+
+	/* Normalize the number of elements to uint32_t */
+	nr_num = (num * esize) / sizeof(uint32_t);
+
+	s0 = size - idx;
+	if (num < s0)
+		copy_elems(du32, su32, nr_num);
+	else {
+		nr_s0 = (s0 * esize) / sizeof(uint32_t);
+		nr_s1 = nr_num - nr_s0;
+		copy_elems(du32, su32, nr_s0);
+		copy_elems(ring_start, su32 + nr_s0, nr_s1);
+	}
+}
+
+static __rte_always_inline void
+dequeue_elems(struct rte_ring *r, void *ring_start, uint32_t cons_head,
+		void *obj_table, uint32_t num, uint32_t esize)
+{
+	uint32_t idx, nr_idx, nr_num;
+	uint32_t *du32;
+	const uint32_t *su32;
+
+	const uint32_t size = r->size;
+	uint32_t s0, nr_s0, nr_s1;
+
+	idx = cons_head & (r)->mask;
+	/* Normalize the idx to uint32_t */
+	nr_idx = (idx * esize) / sizeof(uint32_t);
+
+	su32 = (uint32_t *)ring_start + nr_idx;
+	du32 = obj_table;
+
+	/* Normalize the number of elements to uint32_t */
+	nr_num = (num * esize) / sizeof(uint32_t);
+
+	s0 = size - idx;
+	if (num < s0)
+		copy_elems(du32, su32, nr_num);
+	else {
+		nr_s0 = (s0 * esize) / sizeof(uint32_t);
+		nr_s1 = nr_num - nr_s0;
+		copy_elems(du32, su32, nr_s0);
+		copy_elems(du32 + nr_s0, ring_start, nr_s1);
+	}
+}
 
 /* Between load and load. there might be cpu reorder in weak model
  * (powerpc/arm).
@@ -242,7 +245,7 @@ __rte_ring_do_enqueue_elem(struct rte_ring *r, void * const obj_table,
 	if (n == 0)
 		goto end;
 
-	ENQUEUE_PTRS_GEN(r, &r[1], prod_head, obj_table, esize, n);
+	enqueue_elems(r, &r[1], prod_head, obj_table, n, esize);
 
 	update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
 end:
@@ -289,7 +292,7 @@ __rte_ring_do_dequeue_elem(struct rte_ring *r, void *obj_table,
 	if (n == 0)
 		goto end;
 
-	DEQUEUE_PTRS_GEN(r, &r[1], cons_head, obj_table, esize, n);
+	dequeue_elems(r, &r[1], cons_head, obj_table, n, esize);
 
 	update_tail(&r->cons, cons_head, cons_next, is_sc, 0);