From patchwork Fri Sep 6 09:45:34 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 173206
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com, olivier.matz@6wind.com
Cc: dev@dpdk.org, stephen@networkplumber.org, konstantin.ananyev@intel.com,
 gavin.hu@arm.com, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com, nd@arm.com
Date: Fri, 6 Sep 2019 17:45:34 +0800
Message-Id: <20190906094534.36060-7-ruifeng.wang@arm.com>
In-Reply-To: <20190906094534.36060-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com>
 <20190906094534.36060-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 6/6] test/lpm: add RCU integration performance tests
List-Id: DPDK patches and discussions

From: Honnappa
Nagarahalli

Add performance tests for RCU integration. The performance
difference with and without RCU integration is very small
(~1% to ~2%) on both Arm and x86 platforms.

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
Reviewed-by: Ruifeng Wang
---
 app/test/test_lpm_perf.c | 274 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 271 insertions(+), 3 deletions(-)

-- 
2.17.1

diff --git a/app/test/test_lpm_perf.c b/app/test/test_lpm_perf.c
index a2578fe90..475e5d488 100644
--- a/app/test/test_lpm_perf.c
+++ b/app/test/test_lpm_perf.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2019 Arm Limited
  */
 
 #include
@@ -10,12 +11,23 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 
 #include "test.h"
 #include "test_xmmt_ops.h"
 
+struct rte_lpm *lpm;
+static struct rte_rcu_qsbr *rv;
+static volatile uint8_t writer_done;
+static volatile uint32_t thr_id;
+/* Report quiescent state interval every 8192 lookups. Larger critical
+ * sections in reader will result in writer polling multiple times.
+ */
+#define QSBR_REPORTING_INTERVAL 8192
+
 #define TEST_LPM_ASSERT(cond) do { \
 	if (!(cond)) { \
 		printf("Error at line %d: \n", __LINE__); \
@@ -24,6 +36,7 @@
 } while(0)
 
 #define ITERATIONS (1 << 10)
+#define RCU_ITERATIONS 10
 #define BATCH_SIZE (1 << 12)
 #define BULK_SIZE 32
 
@@ -35,9 +48,13 @@ struct route_rule {
 };
 
 struct route_rule large_route_table[MAX_RULE_NUM];
+/* Route table for routes with depth > 24 */
+struct route_rule large_ldepth_route_table[MAX_RULE_NUM];
 static uint32_t num_route_entries;
+static uint32_t num_ldepth_route_entries;
 #define NUM_ROUTE_ENTRIES num_route_entries
+#define NUM_LDEPTH_ROUTE_ENTRIES num_ldepth_route_entries
 
 enum {
 	IP_CLASS_A,
@@ -191,7 +208,7 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	uint32_t ip_head_mask;
 	uint32_t rule_num;
 	uint32_t k;
-	struct route_rule *ptr_rule;
+	struct route_rule *ptr_rule, *ptr_ldepth_rule;
 
 	if (ip_class == IP_CLASS_A) {	/* IP Address class A */
 		fixed_bit_num = IP_HEAD_BIT_NUM_A;
@@ -236,10 +253,20 @@ static void generate_random_rule_prefix(uint32_t ip_class, uint8_t depth)
 	 */
 	start = lrand48() & mask;
 	ptr_rule = &large_route_table[num_route_entries];
+	ptr_ldepth_rule = &large_ldepth_route_table[num_ldepth_route_entries];
 	for (k = 0; k < rule_num; k++) {
 		ptr_rule->ip = (start << (RTE_LPM_MAX_DEPTH - depth))
 			| ip_head_mask;
 		ptr_rule->depth = depth;
+		/* If the depth of the route is more than 24, store it
+		 * in another table as well.
+		 */
+		if (depth > 24) {
+			ptr_ldepth_rule->ip = ptr_rule->ip;
+			ptr_ldepth_rule->depth = ptr_rule->depth;
+			ptr_ldepth_rule++;
+			num_ldepth_route_entries++;
+		}
 		ptr_rule++;
 		start = (start + step) & mask;
 	}
@@ -273,6 +300,7 @@ static void generate_large_route_rule_table(void)
 	uint8_t depth;
 
 	num_route_entries = 0;
+	num_ldepth_route_entries = 0;
 	memset(large_route_table, 0, sizeof(large_route_table));
 
 	for (ip_class = IP_CLASS_A; ip_class <= IP_CLASS_C; ip_class++) {
@@ -316,10 +344,248 @@ print_route_distribution(const struct route_rule *table, uint32_t n)
 	printf("\n");
 }
 
+/* Check condition and return an error if true. */
+static uint16_t enabled_core_ids[RTE_MAX_LCORE];
+static unsigned int num_cores;
+
+/* Simple way to allocate thread ids in 0 to RTE_MAX_LCORE space */
+static inline uint32_t
+alloc_thread_id(void)
+{
+	uint32_t tmp_thr_id;
+
+	tmp_thr_id = __atomic_fetch_add(&thr_id, 1, __ATOMIC_RELAXED);
+	if (tmp_thr_id >= RTE_MAX_LCORE)
+		printf("Invalid thread id %u\n", tmp_thr_id);
+
+	return tmp_thr_id;
+}
+
+/*
+ * Reader thread using rte_lpm data structure without RCU.
+ */
+static int
+test_lpm_reader(__attribute__((unused)) void *arg)
+{
+	int i;
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+	} while (!writer_done);
+
+	return 0;
+}
+
+/*
+ * Reader thread using rte_lpm data structure with RCU.
+ */
+static int
+test_lpm_rcu_qsbr_reader(__attribute__((unused)) void *arg)
+{
+	int i;
+	uint32_t thread_id = alloc_thread_id();
+	uint32_t ip_batch[QSBR_REPORTING_INTERVAL];
+	uint32_t next_hop_return = 0;
+
+	/* Register this thread to report quiescent state */
+	rte_rcu_qsbr_thread_register(rv, thread_id);
+	rte_rcu_qsbr_thread_online(rv, thread_id);
+
+	do {
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			ip_batch[i] = rte_rand();
+
+		for (i = 0; i < QSBR_REPORTING_INTERVAL; i++)
+			rte_lpm_lookup(lpm, ip_batch[i], &next_hop_return);
+
+		/* Update quiescent state */
+		rte_rcu_qsbr_quiescent(rv, thread_id);
+	} while (!writer_done);
+
+	rte_rcu_qsbr_thread_offline(rv, thread_id);
+	rte_rcu_qsbr_thread_unregister(rv, thread_id);
+
+	return 0;
+}
+
+/*
+ * Functional test:
+ * Single writer, Single QS variable, Single QSBR query,
+ * Non-blocking rcu_qsbr_check
+ */
+static int
+test_lpm_rcu_perf(void)
+{
+	struct rte_lpm_config config;
+	uint64_t begin, total_cycles;
+	size_t sz;
+	unsigned int i, j;
+	uint16_t core_id;
+	uint32_t next_hop_add = 0xAA;
+
+	if (rte_lcore_count() < 2) {
+		printf("Not enough cores for lpm_rcu_perf_autotest, expecting at least 2\n");
+		return TEST_SKIPPED;
+	}
+
+	num_cores = 0;
+	RTE_LCORE_FOREACH_SLAVE(core_id) {
+		enabled_core_ids[num_cores] = core_id;
+		num_cores++;
+	}
+
+	printf("\nPerf test: 1 writer, %d readers, RCU integration enabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	/* Init RCU variable */
+	sz = rte_rcu_qsbr_get_memsize(num_cores);
+	rv = (struct rte_rcu_qsbr *)rte_zmalloc("rcu0", sz,
+						RTE_CACHE_LINE_SIZE);
+	rte_rcu_qsbr_init(rv, num_cores);
+
+	/* Assign the RCU variable to LPM */
+	if (rte_lpm_rcu_qsbr_add(lpm, rv) != 0) {
+		printf("RCU variable assignment failed\n");
+		goto error;
+	}
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_rcu_qsbr_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n",
+		RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * RCU_ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			goto error;
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+	lpm = NULL;
+	rv = NULL;
+
+	/* Test without RCU integration */
+	printf("\nPerf test: 1 writer, %d readers, RCU integration disabled\n",
+		num_cores);
+
+	/* Create LPM table */
+	config.max_rules = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.number_tbl8s = NUM_LDEPTH_ROUTE_ENTRIES;
+	config.flags = 0;
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	writer_done = 0;
+	__atomic_store_n(&thr_id, 0, __ATOMIC_SEQ_CST);
+
+	/* Launch reader threads */
+	for (i = 0; i < num_cores; i++)
+		rte_eal_remote_launch(test_lpm_reader, NULL,
+					enabled_core_ids[i]);
+
+	/* Measure add/delete. */
+	begin = rte_rdtsc_precise();
+	for (i = 0; i < RCU_ITERATIONS; i++) {
+		/* Add all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_add(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth,
+					next_hop_add) != 0) {
+				printf("Failed to add iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+
+		/* Delete all the entries */
+		for (j = 0; j < NUM_LDEPTH_ROUTE_ENTRIES; j++)
+			if (rte_lpm_delete(lpm, large_ldepth_route_table[j].ip,
+					large_ldepth_route_table[j].depth) != 0) {
+				printf("Failed to delete iteration %d, route# %d\n",
+					i, j);
+				goto error;
+			}
+	}
+	total_cycles = rte_rdtsc_precise() - begin;
+
+	printf("Total LPM Adds: %d\n",
+		RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Total LPM Deletes: %d\n",
+		RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES);
+	printf("Average LPM Add/Del: %g cycles\n",
+		(double)total_cycles / (NUM_LDEPTH_ROUTE_ENTRIES * RCU_ITERATIONS));
+
+	writer_done = 1;
+	/* Wait and check return value from reader threads */
+	for (i = 0; i < num_cores; i++)
+		if (rte_eal_wait_lcore(enabled_core_ids[i]) < 0)
+			printf("Warning: lcore %u not finished.\n",
+				enabled_core_ids[i]);
+
+	rte_lpm_free(lpm);
+
+	return 0;
+
+error:
+	writer_done = 1;
+	/* Wait until all readers have exited */
+	rte_eal_mp_wait_lcore();
+
+	rte_lpm_free(lpm);
+	rte_free(rv);
+
+	return -1;
+}
+
 static int
 test_lpm_perf(void)
 {
-	struct rte_lpm *lpm = NULL;
 	struct rte_lpm_config config;
 
 	config.max_rules = 2000000;
@@ -343,7 +609,7 @@ test_lpm_perf(void)
 	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
 	TEST_LPM_ASSERT(lpm != NULL);
 
-	/* Measue add. */
+	/* Measure add. */
 	begin = rte_rdtsc();
 
 	for (i = 0; i < NUM_ROUTE_ENTRIES; i++) {
@@ -478,6 +744,8 @@ test_lpm_perf(void)
 	rte_lpm_delete_all(lpm);
 	rte_lpm_free(lpm);
 
+	test_lpm_rcu_perf();
+
 	return 0;
 }