From patchwork Fri Sep 6 09:45:29 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 173204
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com, olivier.matz@6wind.com
Cc: dev@dpdk.org, stephen@networkplumber.org, konstantin.ananyev@intel.com, gavin.hu@arm.com, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com, nd@arm.com
Date: Fri, 6 Sep 2019 17:45:29 +0800
Message-Id: <20190906094534.36060-2-ruifeng.wang@arm.com>
In-Reply-To: <20190906094534.36060-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com> <20190906094534.36060-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 1/6] doc/rcu: add RCU integration design details

From: Honnappa Nagarahalli

Add a section to describe a design to integrate the QSBR RCU library
with other libraries in DPDK.

Signed-off-by: Honnappa Nagarahalli
---
 doc/guides/prog_guide/rcu_lib.rst | 52 +++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
--
2.17.1

diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index 8fe5b1f73..211948530 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -186,3 +186,55 @@ However, when ``CONFIG_RTE_LIBRTE_RCU_DEBUG`` is enabled, these APIs aid in
 debugging issues. One can mark the access to shared data structures on the
 reader side using these APIs. The ``rte_rcu_qsbr_quiescent()`` will check
 if all the locks are unlocked.
+
+Integrating QSBR RCU with other libraries
+-----------------------------------------
+
+Lock-free algorithms place additional burden on the application to reclaim
+memory. Integrating memory reclamation mechanisms in the libraries helps
+remove some of this burden. Though the QSBR method provides the flexibility
+to achieve performance, it presents challenges when integrating with
+libraries.
+
+The memory reclamation process using QSBR can be split into 4 parts:
+
+#. Initialization
+#. Quiescent State Reporting
+#. Reclaiming Resources
+#. Shutdown
+
+The design proposed here assigns different parts of this process to client
+libraries and applications. The term 'client library' refers to data
+structure libraries such as rte_hash, rte_lpm etc. in DPDK or similar
+libraries outside of DPDK.
+The term 'application' refers to the packet processing application that
+makes use of DPDK, such as the L3 Forwarding example application, OVS,
+VPP, etc.
+
+The application has to handle 'Initialization' and 'Quiescent State
+Reporting'. So,
+
+* the application has to create the RCU variable and register the reader
+  threads to report their quiescent state.
+* the application has to register the same RCU variable with the client
+  library.
+* reader threads in the application have to report the quiescent state.
+  This allows the application to control the length of the critical
+  section, i.e. how frequently the application wants to report the
+  quiescent state.
+
+The client library will handle the 'Reclaiming Resources' part of the
+process. The client libraries will make use of the writer thread context
+to execute the memory reclamation algorithm. So,
+
+* the client library should provide an API to register an RCU variable
+  that it will use.
+* the client library should trigger the readers to report their quiescent
+  state status upon deleting resources, by calling ``rte_rcu_qsbr_start``.
+
+* the client library should store the token and the deleted resources for
+  later use, to free them after the readers have reported their quiescent
+  state. Since the readers will report the quiescent state status in the
+  order of deletion, the library must store the tokens/resources in the
+  order in which the resources were deleted. A FIFO data structure
+  achieves the desired result. The length of the FIFO depends on the rate
+  of deletion and the rate at which the readers report their quiescent
+  state. In the worst case the length of the FIFO would be equal to the
+  maximum number of resources the data structure supports. However, in
+  most cases, the length will be much smaller. The client library should
+  not take the length of the FIFO as an input from the application;
+  instead, it should implement a data structure that can grow/shrink
+  dynamically. The overhead such a data structure introduces on delete
+  operations should be considered as well.
+
+* the client library should query the quiescent state and free the
+  resources. It should make use of the non-blocking ``rte_rcu_qsbr_check``
+  API to query the quiescent state. This allows the application to do
+  useful work while the readers report their quiescent state. If there are
+  tokens/resources already present in the FIFO, the delete API should peek
+  at the head of the FIFO and check the quiescent state status. If the
+  status is success, the token/resource should be dequeued and the
+  resource freed. This process can be repeated until the quiescent state
+  status query for a token returns failure, which indicates that
+  subsequent tokens will fail the query as well. The same process can be
+  incorporated while adding new entries to the data structure if the
+  client library runs out of resources. (A minimal sketch of this
+  writer-side flow follows the 'Shutdown' bullets below.)
+
+The 'Shutdown' process needs to be shared between the application and the
+client library.
+
+* the application should make sure that the reader threads are not using
+  the shared data structure, and unregister the reader threads from the
+  QSBR variable, before calling the client library's shutdown function.
+
+* the client library should check the quiescent state status of all the
+  tokens that may be present in the FIFO and free the resources. It should
+  make use of the non-blocking ``rte_rcu_qsbr_check`` API to query the
+  quiescent state. If any of the tokens do not pass the quiescent state
+  check, the client library should print an error and stop the memory
+  reclamation process.
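
The writer-side flow the bullets above describe condenses to a short
sketch. This is a minimal illustration, not part of the patch: the
fixed-size array FIFO and the client_lib_* names are hypothetical
stand-ins for the dynamically sized structure the text recommends, while
rte_rcu_qsbr_start() and rte_rcu_qsbr_check() are the real library calls.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <rte_rcu_qsbr.h>

/* Hypothetical entry pairing a QSBR token with the resource it guards. */
struct reclaim_entry {
	uint64_t token;
	void *resource;
};

/* Hypothetical fixed-size FIFO; the text recommends a structure that can
 * grow and shrink dynamically instead.
 */
#define RECLAIM_FIFO_SZ 1024
static struct reclaim_entry reclaim_fifo[RECLAIM_FIFO_SZ];
static uint32_t reclaim_head, reclaim_tail;

/* Writer-side delete: unlink the resource, then defer the free. */
static void
client_lib_delete(struct rte_rcu_qsbr *v, void *resource)
{
	/* ... unlink 'resource' from the reader-visible data structure ... */

	/* Ask readers to report their quiescent state for this token. */
	uint64_t token = rte_rcu_qsbr_start(v);

	reclaim_fifo[reclaim_tail % RECLAIM_FIFO_SZ] =
		(struct reclaim_entry){token, resource};
	reclaim_tail++;
}

/* Writer-side reclamation: free entries whose grace period has elapsed.
 * The non-blocking check keeps the writer thread responsive.
 */
static void
client_lib_reclaim(struct rte_rcu_qsbr *v)
{
	while (reclaim_head != reclaim_tail) {
		struct reclaim_entry *e =
			&reclaim_fifo[reclaim_head % RECLAIM_FIFO_SZ];

		/* Tokens are issued in deletion order, so the first
		 * failing check means later tokens will fail too.
		 */
		if (rte_rcu_qsbr_check(v, e->token, false) != 1)
			break;

		free(e->resource);
		reclaim_head++;
	}
}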
+
+Integrating the resource reclamation with client libraries removes this
+burden from the application and makes it easier to use lock-free
+algorithms.
+
+This design has several advantages over currently known methods.
+
+#. The application does not need a dedicated thread to reclaim resources.
+   Memory reclamation happens as part of the writer thread, with little
+   impact on performance.
+#. The client library has better control over the resources. For example,
+   the client library can attempt to reclaim when it has run out of
+   resources.
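
Before the test patches below, it may help to see the application-side
half of this contract ('Initialization' and 'Quiescent State Reporting')
in one place. A minimal sketch under the same QSBR APIs; the app_* names
and the 'done' flag are illustrative, not taken from the patches.

#include <stdint.h>
#include <rte_malloc.h>
#include <rte_memory.h>
#include <rte_rcu_qsbr.h>

/* Application initialization (writer context): create the QSBR variable
 * that will also be registered with the client library.
 */
static struct rte_rcu_qsbr *
app_rcu_init(uint32_t max_reader_threads)
{
	size_t sz = rte_rcu_qsbr_get_memsize(max_reader_threads);
	struct rte_rcu_qsbr *v = rte_zmalloc("app_qsbr", sz,
					     RTE_CACHE_LINE_SIZE);

	if (v == NULL || rte_rcu_qsbr_init(v, max_reader_threads) != 0)
		return NULL;
	return v;
}

/* Reader thread: register, then report a quiescent state once per burst
 * of lookups; the burst size bounds how long the writer has to wait.
 */
static void
app_reader_loop(struct rte_rcu_qsbr *v, uint32_t thread_id,
		volatile uint8_t *done)
{
	rte_rcu_qsbr_thread_register(v, thread_id);
	rte_rcu_qsbr_thread_online(v, thread_id);

	while (!*done) {
		/* ... a batch of lookups on the shared structure ... */
		rte_rcu_qsbr_quiescent(v, thread_id);
	}

	rte_rcu_qsbr_thread_offline(v, thread_id);
	rte_rcu_qsbr_thread_unregister(v, thread_id);
}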
From patchwork Mon Jun 8 05:16:57 2020
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 187588
From: Ruifeng Wang
To: Bruce Richardson, Vladimir Medvedkin
Cc: dev@dpdk.org, konstantin.ananyev@intel.com, honnappa.nagarahalli@arm.com, nd@arm.com
Date: Mon, 8 Jun 2020 13:16:57 +0800
Message-Id: <20200608051658.144417-4-ruifeng.wang@arm.com>
In-Reply-To: <20200608051658.144417-1-ruifeng.wang@arm.com>
References: <20190906094534.36060-1-ruifeng.wang@arm.com> <20200608051658.144417-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v4 3/3] test/lpm: add RCU integration performance tests

From: Honnappa Nagarahalli

Add performance tests for RCU integration. The performance difference
with and without RCU integration is very small (~1% to ~2%) on both Arm
and x86 platforms.

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Ruifeng Wang
---
 app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 489 insertions(+), 3 deletions(-)
--
2.17.1

diff --git a/app/test/test_lpm_perf.c b/app/test/test_lpm_perf.c
index 489719c40..dfe186426 100644
--- a/app/test/test_lpm_perf.c
+++ b/app/test/test_lpm_perf.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
  */
 
 #include
@@ -10,12 +11,27 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
 #include "test.h"
 #include "test_xmmt_ops.h"
 
+struct rte_lpm *lpm;
+static struct rte_rcu_qsbr *rv;
+static volatile uint8_t writer_done;
+static volatile uint32_t thr_id;
+static uint64_t gwrite_cycles;
+static uint64_t gwrites;
+/* LPM APIs are not thread safe, use mutex to provide thread safety */
+static pthread_mutex_t lpm_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+/* Report quiescent state interval every 1024 lookups. Larger critical
+ * sections in reader will result in writer polling multiple times.
+ */
+#define QSBR_REPORTING_INTERVAL 1024
+
 #define TEST_LPM_ASSERT(cond) do {                                \
 	if (!(cond)) {                                            \
 		printf("Error at line %d: \n", __LINE__);         \
@@ -24,6 +40,7 @@
 } while(0)
 
 #define ITERATIONS (1 << 10)
+#define RCU_ITERATIONS 10
 #define BATCH_SIZE (1 << 12)
 #define BULK_SIZE 32
 
@@ -35,9 +52,13 @@ struct route_rule {
 };
 
 static struct route_rule large_route_table[MAX_RULE_NUM];
+/* Route table for routes with depth > 24 */
+struct route_rule large_ldepth_route_table[MAX_RULE_NUM];
 
 static uint32_t num_route_entries;
+static uint32_t num_ldepth_route_entries;
 #define NUM_ROUTE_ENTRIES num_route_entries
+#define NUM_LDEPTH_ROUTE_ENTRIES num_ldepth_route_entries
 
 enum {
 	IP_CLASS_A,
@@ -191,7 +212,7 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	uint32_t ip_head_mask;
 	uint32_t rule_num;
 	uint32_t k;
-	struct route_rule *ptr_rule;
+	struct route_rule *ptr_rule, *ptr_ldepth_rule;
 
 	if (ip_class == IP_CLASS_A) {        /* IP Address class A */
 		fixed_bit_num = IP_HEAD_BIT_NUM_A;
@@ -236,10 +257,20 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	 */
 	start = lrand48() & mask;
 	ptr_rule = &large_route_table[num_route_entries];
+	ptr_ldepth_rule = &large_ldepth_route_table[num_ldepth_route_entries];
 	for (k = 0; k < rule_num; k++) {
 		ptr_rule->ip = (start << (RTE_LPM_MAX_DEPTH - depth))
 			| ip_head_mask;
 		ptr_rule->depth = depth;
+		/* If the depth of the route is more than 24, store it
+		 * in another table as well.
+		 */
+		if (depth > 24) {
+			ptr_ldepth_rule->ip = ptr_rule->ip;
+			ptr_ldepth_rule->depth = ptr_rule->depth;
+			ptr_ldepth_rule++;
+			num_ldepth_route_entries++;
+		}
 		ptr_rule++;
 		start = (start + step) & mask;
 	}
@@ -273,6 +304,7 @@ static void generate_large_route_rule_table(void)
 	uint8_t depth;
 
 	num_route_entries = 0;
+	num_ldepth_route_entries = 0;
 	memset(large_route_table, 0, sizeof(large_route_table));
 
 	for (ip_class = IP_CLASS_A; ip_class <= IP_CLASS_C; ip_class++) {
@@ -316,10 +348,460 @@ print_route_distribution(const struct route_rule *table, uint32_t n)
 	printf("\n");
 }
 
+/* Check condition and return an error if true. */
+static uint16_t enabled_core_ids[RTE_MAX_LCORE];
+static unsigned int num_cores;
+
+/* Simple way to allocate thread ids in 0 to RTE_MAX_LCORE space */
+static inline uint32_t
+alloc_thread_id(void)
+{
+	uint32_t tmp_thr_id;
+
+	tmp_thr_id = __atomic_fetch_add(&thr_id, 1, __ATOMIC_RELAXED);
+	if (tmp_thr_id >= RTE_MAX_LCORE)
+		printf("Invalid thread id %u\n", tmp_thr_id);
+
+	return tmp_thr_id;
+}
+
+/*
+ * Reader thread using rte_lpm data structure without RCU.
+ */
+static int
+test_lpm_reader(void *arg)
+{
+	int i;
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	RTE_SET_USED(arg);
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+	} while (!writer_done);
+
+	return 0;
+}
+
+/*
+ * Reader thread using rte_lpm data structure with RCU.
+ */
+static int
+test_lpm_rcu_qsbr_reader(void *arg)
+{
+	int i;
+	uint32_t thread_id = alloc_thread_id();
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	RTE_SET_USED(arg);
+	/* Register this thread to report quiescent state */
+	rte_rcu_qsbr_thread_register(rv, thread_id);
+	rte_rcu_qsbr_thread_online(rv, thread_id);
+
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+		/* Update quiescent state */
+		rte_rcu_qsbr_quiescent(rv, thread_id);
+	} while (!writer_done);
+
+	rte_rcu_qsbr_thread_offline(rv, thread_id);
+	rte_rcu_qsbr_thread_unregister(rv, thread_id);
+
+	return 0;
+}
+
+/*
+ * Writer thread using rte_lpm data structure with RCU.
+ */
+static int
+test_lpm_rcu_qsbr_writer(void *arg)
+{
+	unsigned int i, j, si, ei;
+	uint64_t begin, total_cycles;
+	uint8_t core_id = (uint8_t)((uintptr_t)arg);
+	uint32_t next_hop_add = 0xAA;
+
+	RTE_SET_USED(arg);
+	/* 2 writer threads are used */
+	if (core_id % 2 == 0) {
+		si = 0;
+		ei = NUM_LDEPTH_ROUTE_ENTRIES / 2;
+	} else {
+		si = NUM_LDEPTH_ROUTE_ENTRIES / 2;
+		ei = NUM_LDEPTH_ROUTE_ENTRIES;
+	}
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = si; j < ei; j++) {
+			pthread_mutex_lock(&lpm_mutex);
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+			}
+			pthread_mutex_unlock(&lpm_mutex);
+		}
+
+		/* Delete all the entries */
+		for (j = si; j < ei; j++) {
+			pthread_mutex_lock(&lpm_mutex);
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+			}
+			pthread_mutex_unlock(&lpm_mutex);
+		}
+	}
+
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	__atomic_fetch_add(&gwrite_cycles, total_cycles, __ATOMIC_RELAXED);
+	__atomic_fetch_add(&gwrites,
+			2 * NUM_LDEPTH_ROUTE_ENTRIES * RCU_ITERATIONS,
+			__ATOMIC_RELAXED);
+
+	return 0;
+}
+
+/*
+ * Functional test:
+ * 2 writers, rest are readers
+ */
+static int
+test_lpm_rcu_perf_multi_writer(void)
+{
+	struct rte_lpm_config config;
+	size_t sz;
+	unsigned int i;
+	uint16_t core_id;
+	struct rte_lpm_rcu_config rcu_cfg = {0};
+
+	if (rte_lcore_count() < 3) {
+		printf("Not enough cores for lpm_rcu_perf_autotest, expecting at least 3\n");
+		return TEST_SKIPPED;
+	}
+
+	num_cores = 0;
+	RTE_LCORE_FOREACH_SLAVE(core_id) {
+		enabled_core_ids[num_cores] = core_id;
+		num_cores++;
+	}
+
+	printf("\nPerf test: 2 writers, %d readers, RCU integration enabled\n",
+		num_cores - 2);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	/* Init RCU variable */
+	sz = rte_rcu_qsbr_get_memsize(num_cores);
+	rv = (struct rte_rcu_qsbr *)rte_zmalloc("rcu0", sz,
+						RTE_CACHE_LINE_SIZE);
+	rte_rcu_qsbr_init(rv, num_cores);
+
+	rcu_cfg.v = rv;
+	/* Assign the RCU variable to LPM */
+	if (rte_lpm_rcu_qsbr_add(lpm, &rcu_cfg, NULL) != 0) {
+		printf("RCU variable assignment failed\n");
+		goto error;
+	}
+
+	writer_done = 0;
+	__atomic_store_n(&gwrite_cycles, 0, __ATOMIC_RELAXED);
+	__atomic_store_n(&gwrites, 0, __ATOMIC_RELAXED);
+
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 2; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Launch writer threads */
+	for (i = 0; i < 2; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_writer,
+					(void *)(uintptr_t)i,
+					enabled_core_ids[i]);
+
+	/* Wait for writer threads */
+	for (i = 0; i < 2; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	printf("Total LPM Adds: %d\n",
+		2 * ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		2 * ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %"PRIu64" cycles\n",
+		__atomic_load_n(&gwrite_cycles, __ATOMIC_RELAXED) /
+		__atomic_load_n(&gwrites, __ATOMIC_RELAXED)
+		);
+
+	/* Wait and check return value from reader threads */
+	writer_done = 1;
+	for (i = 2; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+	lpm = NULL;
+	rv = NULL;
+
+	/* Test without RCU integration */
+	printf("\nPerf test: 2 writers, %d readers, RCU integration disabled\n",
+		num_cores - 2);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	writer_done = 0;
+	__atomic_store_n(&gwrite_cycles, 0, __ATOMIC_RELAXED);
+	__atomic_store_n(&gwrites, 0, __ATOMIC_RELAXED);
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 2; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Launch writer threads */
+	for (i = 0; i < 2; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_writer,
+					(void *)(uintptr_t)i,
+					enabled_core_ids[i]);
+
+	/* Wait for writer threads */
+	for (i = 0; i < 2; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	printf("Total LPM Adds: %d\n",
+		2 * ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		2 * ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %"PRIu64" cycles\n",
+		__atomic_load_n(&gwrite_cycles, __ATOMIC_RELAXED) /
+		__atomic_load_n(&gwrites, __ATOMIC_RELAXED)
+		);
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 2; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	rte_lpm_free(lpm);
+
+	return 0;
+
+error:
+	writer_done = 1;
+	/* Wait until all readers have exited */
+	rte_eal_mp_wait_lcore();
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+
+	return -1;
+}
+
+/*
+ * Functional test:
+ * Single writer, rest are readers
+ */
+static int
+test_lpm_rcu_perf(void)
+{
+	struct rte_lpm_config config;
+	uint64_t begin, total_cycles;
+	size_t sz;
+	unsigned int i, j;
+	uint16_t core_id;
+	uint32_t next_hop_add = 0xAA;
+	struct rte_lpm_rcu_config rcu_cfg = {0};
+
+	if (rte_lcore_count() < 2) {
+		printf("Not enough cores for lpm_rcu_perf_autotest, expecting at least 2\n");
+		return TEST_SKIPPED;
+	}
+
+	num_cores = 0;
+	RTE_LCORE_FOREACH_SLAVE(core_id) {
+		enabled_core_ids[num_cores] = core_id;
+		num_cores++;
+	}
+
+	printf("\nPerf test: 1 writer, %d readers, RCU integration enabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	/* Init RCU variable */
+	sz = rte_rcu_qsbr_get_memsize(num_cores);
+	rv = (struct rte_rcu_qsbr *)rte_zmalloc("rcu0", sz,
+						RTE_CACHE_LINE_SIZE);
+	rte_rcu_qsbr_init(rv, num_cores);
+
+	rcu_cfg.v = rv;
+	/* Assign the RCU variable to LPM */
+	if (rte_lpm_rcu_qsbr_add(lpm, &rcu_cfg, NULL) != 0) {
+		printf("RCU variable assignment failed\n");
+		goto error;
+	}
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n", ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+	lpm = NULL;
+	rv = NULL;
+
+	/* Test without RCU integration */
+	printf("\nPerf test: 1 writer, %d readers, RCU integration disabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n", ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			printf("Warning: lcore %u not finished.\n",
+				enabled_core_ids[i]);
+
+	rte_lpm_free(lpm);
+
+	return 0;
+
+error:
+	writer_done = 1;
+	/* Wait until all readers have exited */
+	rte_eal_mp_wait_lcore();
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+
+	return -1;
+}
+
 static int
 test_lpm_perf(void)
 {
-	struct rte_lpm *lpm = NULL;
 	struct rte_lpm_config config;
 
 	config.max_rules = 2000000;
@@ -343,7 +825,7 @@ test_lpm_perf(void)
 	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
 	TEST_LPM_ASSERT(lpm != NULL);
 
-	/* Measue add. */
+	/* Measure add. */
 	begin = rte_rdtsc();
 
 	for (i = 0; i < NUM_ROUTE_ENTRIES; i++) {
@@ -478,6 +960,10 @@ test_lpm_perf(void)
 	rte_lpm_delete_all(lpm);
 	rte_lpm_free(lpm);
 
+	test_lpm_rcu_perf();
+
+	test_lpm_rcu_perf_multi_writer();
+
 	return 0;
 }
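
The QSBR_REPORTING_INTERVAL comment in the patch above captures the
central trade-off: the longer a reader's critical section, the more often
the writer's non-blocking poll comes back empty. For writers that would
rather block than poll, the library also offers a blocking path. A minimal
sketch, not from the patch; the writer_* function names and the resource
pointer are illustrative, while rte_rcu_qsbr_synchronize() and
rte_rcu_qsbr_check() are the real calls:

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <rte_rcu_qsbr.h>

/* Blocking path: wait until every registered reader has reported a
 * quiescent state at least once, then free immediately.
 */
static void
writer_free_blocking(struct rte_rcu_qsbr *v, void *resource)
{
	rte_rcu_qsbr_synchronize(v, RTE_QSBR_THRID_INVALID);
	free(resource);
}

/* Non-blocking path, as the tests and library integration use: returns
 * at once so the writer can do useful work and retry later.
 */
static bool
writer_try_free(struct rte_rcu_qsbr *v, uint64_t token, void *resource)
{
	if (rte_rcu_qsbr_check(v, token, false) != 1)
		return false;
	free(resource);
	return true;
}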
From patchwork Fri Sep 6 09:45:33 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 173205
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com, olivier.matz@6wind.com
Cc: dev@dpdk.org, stephen@networkplumber.org, konstantin.ananyev@intel.com, gavin.hu@arm.com, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com, nd@arm.com, stable@dpdk.org
Date: Fri, 6 Sep 2019 17:45:33 +0800
Message-Id: <20190906094534.36060-6-ruifeng.wang@arm.com>
In-Reply-To: <20190906094534.36060-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com> <20190906094534.36060-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 5/6] test/lpm: reset total time

From: Honnappa Nagarahalli

total_time needs to be reset to measure the cycles for the delete API.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
Reviewed-by: Ruifeng Wang
---
 app/test/test_lpm_perf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--
2.17.1

diff --git a/app/test/test_lpm_perf.c b/app/test/test_lpm_perf.c
index 77eea66ad..a2578fe90 100644
--- a/app/test/test_lpm_perf.c
+++ b/app/test/test_lpm_perf.c
@@ -460,7 +460,7 @@ test_lpm_perf(void)
 			(double)total_time / ((double)ITERATIONS * BATCH_SIZE),
 			(count * 100.0) / (double)(ITERATIONS * BATCH_SIZE));
 
-	/* Delete */
+	/* Measure Delete */
 	status = 0;
 	begin = rte_rdtsc();
 
@@ -470,7 +470,7 @@ test_lpm_perf(void)
 			large_route_table[i].depth);
 	}
 
-	total_time += rte_rdtsc() - begin;
+	total_time = rte_rdtsc() - begin;
 
 	printf("Average LPM Delete: %g cycles\n",
 	       (double)total_time / NUM_ROUTE_ENTRIES);
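
The one-character fix above is easy to gloss over, so the measurement
pattern it restores is sketched below. This is an illustration, not code
from the patch; the function name is hypothetical:

#include <stdint.h>
#include <stdio.h>
#include <rte_cycles.h>

static void
measure_phases_sketch(void)
{
	uint64_t begin, total_time;

	/* Each measured phase takes its own rte_rdtsc() baseline. */
	begin = rte_rdtsc();
	/* ... add phase ... */
	total_time = rte_rdtsc() - begin;
	printf("add cycles: %lu\n", (unsigned long)total_time);

	begin = rte_rdtsc();
	/* ... delete phase ... */
	/* '=' rather than '+=': accumulating here would carry the add
	 * phase's cycles into the delete average, which is the bug the
	 * patch fixes.
	 */
	total_time = rte_rdtsc() - begin;
	printf("delete cycles: %lu\n", (unsigned long)total_time);
}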
From patchwork Fri Sep 6 09:45:34 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 173206
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com, olivier.matz@6wind.com
Cc: dev@dpdk.org, stephen@networkplumber.org, konstantin.ananyev@intel.com, gavin.hu@arm.com, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com, nd@arm.com
Date: Fri, 6 Sep 2019 17:45:34 +0800
Message-Id: <20190906094534.36060-7-ruifeng.wang@arm.com>
In-Reply-To: <20190906094534.36060-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com> <20190906094534.36060-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 6/6] test/lpm: add RCU integration performance tests

From: Honnappa Nagarahalli

Add performance tests for RCU integration. The performance difference
with and without RCU integration is very small (~1% to ~2%) on both Arm
and x86 platforms.

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
Reviewed-by: Ruifeng Wang
---
 app/test/test_lpm_perf.c | 274 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 271 insertions(+), 3 deletions(-)
--
2.17.1

diff --git a/app/test/test_lpm_perf.c b/app/test/test_lpm_perf.c
index a2578fe90..475e5d488 100644
--- a/app/test/test_lpm_perf.c
+++ b/app/test/test_lpm_perf.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2019 Arm Limited
  */
 
 #include
@@ -10,12 +11,23 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 
 #include "test.h"
 #include "test_xmmt_ops.h"
 
+struct rte_lpm *lpm;
+static struct rte_rcu_qsbr *rv;
+static volatile uint8_t writer_done;
+static volatile uint32_t thr_id;
+/* Report quiescent state interval every 8192 lookups. Larger critical
+ * sections in reader will result in writer polling multiple times.
+ */
+#define QSBR_REPORTING_INTERVAL 8192
+
 #define TEST_LPM_ASSERT(cond) do {                                \
 	if (!(cond)) {                                            \
 		printf("Error at line %d: \n", __LINE__);         \
@@ -24,6 +36,7 @@
 } while(0)
 
 #define ITERATIONS (1 << 10)
+#define RCU_ITERATIONS 10
 #define BATCH_SIZE (1 << 12)
 #define BULK_SIZE 32
 
@@ -35,9 +48,13 @@ struct route_rule {
 };
 
 struct route_rule large_route_table[MAX_RULE_NUM];
+/* Route table for routes with depth > 24 */
+struct route_rule large_ldepth_route_table[MAX_RULE_NUM];
 
 static uint32_t num_route_entries;
+static uint32_t num_ldepth_route_entries;
 #define NUM_ROUTE_ENTRIES num_route_entries
+#define NUM_LDEPTH_ROUTE_ENTRIES num_ldepth_route_entries
 
 enum {
 	IP_CLASS_A,
@@ -191,7 +208,7 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	uint32_t ip_head_mask;
 	uint32_t rule_num;
 	uint32_t k;
-	struct route_rule *ptr_rule;
+	struct route_rule *ptr_rule, *ptr_ldepth_rule;
 
 	if (ip_class == IP_CLASS_A) {        /* IP Address class A */
 		fixed_bit_num = IP_HEAD_BIT_NUM_A;
@@ -236,10 +253,20 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	 */
 	start = lrand48() & mask;
 	ptr_rule = &large_route_table[num_route_entries];
+	ptr_ldepth_rule = &large_ldepth_route_table[num_ldepth_route_entries];
 	for (k = 0; k < rule_num; k++) {
 		ptr_rule->ip = (start << (RTE_LPM_MAX_DEPTH - depth))
 			| ip_head_mask;
 		ptr_rule->depth = depth;
+		/* If the depth of the route is more than 24, store it
+		 * in another table as well.
+		 */
+		if (depth > 24) {
+			ptr_ldepth_rule->ip = ptr_rule->ip;
+			ptr_ldepth_rule->depth = ptr_rule->depth;
+			ptr_ldepth_rule++;
+			num_ldepth_route_entries++;
+		}
 		ptr_rule++;
 		start = (start + step) & mask;
 	}
@@ -273,6 +300,7 @@ static void generate_large_route_rule_table(void)
 	uint8_t depth;
 
 	num_route_entries = 0;
+	num_ldepth_route_entries = 0;
 	memset(large_route_table, 0, sizeof(large_route_table));
 
 	for (ip_class = IP_CLASS_A; ip_class <= IP_CLASS_C; ip_class++) {
@@ -316,10 +344,248 @@ print_route_distribution(const struct route_rule *table, uint32_t n)
 	printf("\n");
 }
 
+/* Check condition and return an error if true. */
+static uint16_t enabled_core_ids[RTE_MAX_LCORE];
+static unsigned int num_cores;
+
+/* Simple way to allocate thread ids in 0 to RTE_MAX_LCORE space */
+static inline uint32_t
+alloc_thread_id(void)
+{
+	uint32_t tmp_thr_id;
+
+	tmp_thr_id = __atomic_fetch_add(&thr_id, 1, __ATOMIC_RELAXED);
+	if (tmp_thr_id >= RTE_MAX_LCORE)
+		printf("Invalid thread id %u\n", tmp_thr_id);
+
+	return tmp_thr_id;
+}
+
+/*
+ * Reader thread using rte_lpm data structure without RCU.
+ */
+static int
+test_lpm_reader(__attribute__((unused)) void *arg)
+{
+	int i;
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+	} while (!writer_done);
+
+	return 0;
+}
+
+/*
+ * Reader thread using rte_lpm data structure with RCU.
+ */
+static int
+test_lpm_rcu_qsbr_reader(__attribute__((unused)) void *arg)
+{
+	int i;
+	uint32_t thread_id = alloc_thread_id();
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	/* Register this thread to report quiescent state */
+	rte_rcu_qsbr_thread_register(rv, thread_id);
+	rte_rcu_qsbr_thread_online(rv, thread_id);
+
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+		/* Update quiescent state */
+		rte_rcu_qsbr_quiescent(rv, thread_id);
+	} while (!writer_done);
+
+	rte_rcu_qsbr_thread_offline(rv, thread_id);
+	rte_rcu_qsbr_thread_unregister(rv, thread_id);
+
+	return 0;
+}
+
+/*
+ * Functional test:
+ * Single writer, Single QS variable, Single QSBR query,
+ * Non-blocking rcu_qsbr_check
+ */
+static int
+test_lpm_rcu_perf(void)
+{
+	struct rte_lpm_config config;
+	uint64_t begin, total_cycles;
+	size_t sz;
+	unsigned int i, j;
+	uint16_t core_id;
+	uint32_t next_hop_add = 0xAA;
+
+	if (rte_lcore_count() < 2) {
+		printf("Not enough cores for lpm_rcu_perf_autotest, expecting at least 2\n");
+		return TEST_SKIPPED;
+	}
+
+	num_cores = 0;
+	RTE_LCORE_FOREACH_SLAVE(core_id) {
+		enabled_core_ids[num_cores] = core_id;
+		num_cores++;
+	}
+
+	printf("\nPerf test: 1 writer, %d readers, RCU integration enabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	/* Init RCU variable */
+	sz = rte_rcu_qsbr_get_memsize(num_cores);
+	rv = (struct rte_rcu_qsbr *)rte_zmalloc("rcu0", sz,
+						RTE_CACHE_LINE_SIZE);
+	rte_rcu_qsbr_init(rv, num_cores);
+
+	/* Assign the RCU variable to LPM */
+	if (rte_lpm_rcu_qsbr_add(lpm, rv) != 0) {
+		printf("RCU variable assignment failed\n");
+		goto error;
+	}
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n", ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+	lpm = NULL;
+	rv = NULL;
+
+	/* Test without RCU integration */
+	printf("\nPerf test: 1 writer, %d readers, RCU integration disabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n", ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			printf("Warning: lcore %u not finished.\n",
+				enabled_core_ids[i]);
+
+	rte_lpm_free(lpm);
+
+	return 0;
+
+error:
+	writer_done = 1;
+	/* Wait until all readers have exited */
+	rte_eal_mp_wait_lcore();
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+
+	return -1;
+}
+
 static int
 test_lpm_perf(void)
 {
-	struct rte_lpm *lpm = NULL;
 	struct rte_lpm_config config;
 
 	config.max_rules = 2000000;
@@ -343,7 +609,7 @@ test_lpm_perf(void)
 	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
 	TEST_LPM_ASSERT(lpm != NULL);
 
-	/* Measue add. */
+	/* Measure add. */
 	begin = rte_rdtsc();
 
 	for (i = 0; i < NUM_ROUTE_ENTRIES; i++) {
@@ -478,6 +744,8 @@ test_lpm_perf(void)
 	rte_lpm_delete_all(lpm);
 	rte_lpm_free(lpm);
 
+	test_lpm_rcu_perf();
+
 	return 0;
 }
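
Neither the v2 nor the v4 hunks touch the end of the file, where the test
is registered with app/test's framework. For context, a sketch assuming
the upstream registration macro and command name; the RCU cases above run
as part of this entry point:

#include "test.h"

/* Registration (unchanged by these patches): exposes test_lpm_perf,
 * including the RCU perf cases called at its end, to the dpdk-test
 * runner as 'lpm_perf_autotest'.
 */
REGISTER_TEST_COMMAND(lpm_perf_autotest, test_lpm_perf);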