From patchwork Mon Sep 19 12:05:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 607428 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFC67ECAAD3 for ; Mon, 19 Sep 2022 12:08:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229722AbiISMIH (ORCPT ); Mon, 19 Sep 2022 08:08:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37144 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230028AbiISMIE (ORCPT ); Mon, 19 Sep 2022 08:08:04 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30C0D2A423; Mon, 19 Sep 2022 05:08:02 -0700 (PDT) Received: from dggpemm500022.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4MWNd32spCzmVRF; Mon, 19 Sep 2022 20:04:07 +0800 (CST) Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500022.china.huawei.com (7.185.36.162) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 From: Yang Shen To: , CC: , , Subject: [RFC PATCH 1/6] moduleparams: Add hexulong type parameter Date: Mon, 19 Sep 2022 20:05:32 +0800 Message-ID: <20220919120537.39258-2-shenyang39@huawei.com> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com> References: 
<20220919120537.39258-1-shenyang39@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.67.164.66]
X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Since bitmap.h uses an unsigned long pointer for bitmap variables, adding
a 'hexulong' parameter type is more convenient.

Signed-off-by: Yang Shen
---
 include/linux/moduleparam.h | 7 ++++++-
 kernel/params.c             | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 962cd41a2cb5..9e0828fa3946 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -118,7 +118,7 @@ struct kparam_array
  * you can create your own by defining those variables.
  *
  * Standard types are:
- *	byte, hexint, short, ushort, int, uint, long, ulong
+ *	byte, hexint, hexulong, short, ushort, int, uint, long, ulong
  *	charp: a character pointer
  *	bool: a bool, values 0/1, y/n, Y/N.
  *	invbool: the above, only sense-reversed (N = true).
@@ -455,6 +455,11 @@ extern int param_set_hexint(const char *val, const struct kernel_param *kp); extern int param_get_hexint(char *buffer, const struct kernel_param *kp); #define param_check_hexint(name, p) param_check_uint(name, p) +extern const struct kernel_param_ops param_ops_hexulong; +extern int param_set_hexulong(const char *val, const struct kernel_param *kp); +extern int param_get_hexulong(char *buffer, const struct kernel_param *kp); +#define param_check_hexulong(name, p) param_check_ulong(name, p) + extern const struct kernel_param_ops param_ops_charp; extern int param_set_charp(const char *val, const struct kernel_param *kp); extern int param_get_charp(char *buffer, const struct kernel_param *kp); diff --git a/kernel/params.c b/kernel/params.c index 5b92310425c5..f367f0c1f228 100644 --- a/kernel/params.c +++ b/kernel/params.c @@ -242,6 +242,7 @@ STANDARD_PARAM_DEF(long, long, "%li", kstrtol); STANDARD_PARAM_DEF(ulong, unsigned long, "%lu", kstrtoul); STANDARD_PARAM_DEF(ullong, unsigned long long, "%llu", kstrtoull); STANDARD_PARAM_DEF(hexint, unsigned int, "%#08x", kstrtouint); +STANDARD_PARAM_DEF(hexulong, unsigned long, "%#016lx", kstrtoul); int param_set_uint_minmax(const char *val, const struct kernel_param *kp, unsigned int min, unsigned int max) From patchwork Mon Sep 19 12:05:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 608040 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C618AC54EE9 for ; Mon, 19 Sep 2022 12:08:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230045AbiISMII (ORCPT ); Mon, 19 Sep 2022 08:08:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37050 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230062AbiISMIF (ORCPT ); Mon, 19 Sep 2022 08:08:05 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13D8628726; Mon, 19 Sep 2022 05:08:02 -0700 (PDT)
Received: from dggpemm500023.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4MWNcq5LY0zlW26; Mon, 19 Sep 2022 20:03:55 +0800 (CST)
Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800
Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800
From: Yang Shen
To: ,
CC: , ,
Subject: [RFC PATCH 2/6] crypto: benchmark - add a crypto benchmark tool
Date: Mon, 19 Sep 2022 20:05:33 +0800
Message-ID: <20220919120537.39258-3-shenyang39@huawei.com>
X-Mailer: git-send-email 2.24.0
In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com>
References: <20220919120537.39258-1-shenyang39@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.67.164.66]
X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Provide a crypto benchmark to help developers quickly measure the
performance of an algorithm registered in crypto.

Because crypto algorithms have a wide variety of parameters, the tool
cannot support every test scenario. To keep the tool simple and easy to
use while supporting as many scenarios as possible, the benchmark follows
the crypto approach and provides a unified struct 'crypto_bm_ops'. Each
algorithm class registers its own callbacks to parse the user's input.
In crypto, an algorithm class contains multiple algorithms, but all of
them use the same API. So in the benchmark, an algorithm class shares one
'ops' and distinguishes the specific algorithm by name.

First, consider the performance calculation model. Given the crypto
subsystem model, reasonable code based on the crypto API should create a
NUMA-node-based 'crypto_tfm' in advance and allocate a certain number of
'crypto_req' according to its own workload. In the real processing stage,
the thread sends tasks based on 'crypto_req' and waits for completion.
Therefore, the benchmark creates the 'crypto_tfm' and 'crypto_req' first,
and then times only the requests to calculate performance. So the result
is the pure algorithm performance.

When an algorithm class implements its own 'ops', it needs to pay
attention to what is done in each callback. The request data set should
be prepared before 'ops.perf' is called. To avoid falsely high
performance caused by unrealistic cache and TLB hit rates, the data set
should contain more entries than the number of 'crypto_req'.

The 'crypto_bm_ops' has the following API:
 - init & uninit
   The initialization related functions. The algorithm can do some
   private setup here.
 - create_tfm & release_tfm
   The 'crypto_tfm' related functions. Algorithm classes have different
   tfm type names in crypto, but they all have a member named tfm, so
   'tfm' is used to stand for the algorithm handle. The benchmark
   provides the tfm array.
 - create_req & release_req
   The 'crypto_req' related functions. The callbacks should create a
   group of 'reqnum' 'crypto_req' in struct 'crypto_bm_base'. It is also
   suggested to prepare the request data in this function. To avoid
   falsely high performance caused by unrealistic cache and TLB hit
   rates, the data set should contain more entries than the number of
   'crypto_req'.
 - perf
   The request sending function. The registrant should use the parameter
   'loop' to send requests repeatedly and update the count in struct
   'crypto_bm_thread_data'.

Then consider the parameters that the user can configure. Generally
speaking, the following parameters affect the performance of the
algorithm: tfm number, request number, block size and NUMA node. Some
parameters affect the stability of the result: the testing time and the
number of requests sent.

To sum up, the benchmark has the following parameters:
 - algorithm
   The name of the algorithm under test, as shown in /proc/crypto.
 - algtype
   The class of the algorithm under test. The available classes can be
   listed by echoing 'algtype' to
   /sys/module/crypto_benchmark/parameters/help.
 - inputsize
   The length that most affects performance, such as the data size for
   compression or the key length for encryption.
 - loop
   The number of test loops. Avoids performance fluctuations caused by
   the environment.
 - numamask
   The NUMA node mask the test binds to. Used to allocate memory, create
   threads and create 'crypto_tfm'.
 - optype
   The operation type of the algorithm under test. The operation types
   available for the specified 'algtype' can be listed by cat
   /sys/module/crypto_benchmark/parameters/help.
 - reqnum
   The number of requests per tfm. Used to test asynchronous API
   performance.
 - threadnum
   The number of test threads. To simplify the model, one 'crypto_tfm' is
   created per thread.
 - time
   The testing time in seconds. Used to stop the test threads.
 - run
   Start or stop the test.

Users can configure the parameters under
/sys/module/crypto_benchmark/parameters/ and then echo 1 to 'run' to
start the test. To stop the test, echo 0 to 'run'.
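
The configure-then-run sequence described above can also be driven from a
small user-space program instead of a shell. The following is only a
sketch: 'struct bm_setting', 'bm_write_param' and 'bm_configure_and_run'
are hypothetical helper names that are not part of this series, and the
directory is passed in by the caller rather than hard-coded.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type: one parameter file name and the value to write. */
struct bm_setting {
	const char *name;
	const char *value;
};

/* Write value to <dir>/<name>, like `echo value > dir/name`. 0 on success. */
static int bm_write_param(const char *dir, const char *name, const char *value)
{
	char path[512];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputs(value, f) == EOF) {
		fclose(f);
		return -1;
	}

	/* The data is buffered, so a write error only surfaces on flush. */
	return fclose(f) == 0 ? 0 : -1;
}

/* Apply every setting in order, then start the test by writing 1 to 'run'. */
static int bm_configure_and_run(const char *dir,
				const struct bm_setting *settings, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (bm_write_param(dir, settings[i].name, settings[i].value))
			return -1;
	}

	return bm_write_param(dir, "run", "1");
}
```

A caller would pass "/sys/module/crypto_benchmark/parameters" as 'dir'
together with pairs such as { "algorithm", "deflate" } and
{ "inputsize", "4096" }; these values are illustrative only. Writing "0"
to 'run' with bm_write_param() stops the test.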
Signed-off-by: Yang Shen
---
 crypto/Kconfig               |   2 +
 crypto/Makefile              |   5 +
 crypto/benchmark/Kconfig     |  11 +
 crypto/benchmark/Makefile    |   3 +
 crypto/benchmark/benchmark.c | 509 +++++++++++++++++++++++++++++++++++
 crypto/benchmark/benchmark.h |  76 ++++++
 6 files changed, 606 insertions(+)
 create mode 100644 crypto/benchmark/Kconfig
 create mode 100644 crypto/benchmark/Makefile
 create mode 100644 crypto/benchmark/benchmark.c
 create mode 100644 crypto/benchmark/benchmark.h

--
2.24.0

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 40423a14f86f..a0f618f349fc 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1438,4 +1438,6 @@ source "drivers/crypto/Kconfig"
 source "crypto/asymmetric_keys/Kconfig"
 source "certs/Kconfig"
 
+source "crypto/benchmark/Kconfig"
+
 endif	# if CRYPTO
diff --git a/crypto/Makefile b/crypto/Makefile
index a6f94e04e1da..67edf4e1337c 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -212,3 +212,8 @@ obj-$(CONFIG_CRYPTO_SIMD) += crypto_simd.o
 # Key derivation function
 #
 obj-$(CONFIG_CRYPTO_KDF800108_CTR) += kdf_sp800108.o
+
+#
+# crypto benchmark
+#
+obj-y += benchmark/
diff --git a/crypto/benchmark/Kconfig b/crypto/benchmark/Kconfig
new file mode 100644
index 000000000000..abee14ba8e40
--- /dev/null
+++ b/crypto/benchmark/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config CRYPTO_BENCHMARK
+	bool "Testing performance of crypto algorithms"
+	depends on CRYPTO
+	help
+	  This option supports testing the performance of the crypto async
+	  API. Select this if you want to test crypto algorithm performance
+	  conveniently.
+	  Before using it, check whether the algorithm class is
+	  supported.
diff --git a/crypto/benchmark/Makefile b/crypto/benchmark/Makefile new file mode 100644 index 000000000000..5244178e14c4 --- /dev/null +++ b/crypto/benchmark/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CRYPTO_BENCHMARK) += crypto_benchmark.o +crypto_benchmark-objs += benchmark.o diff --git a/crypto/benchmark/benchmark.c b/crypto/benchmark/benchmark.c new file mode 100644 index 000000000000..9a833b277d87 --- /dev/null +++ b/crypto/benchmark/benchmark.c @@ -0,0 +1,509 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022 HiSilicon Limited. + */ +#include +#include +#include +#include +#include +#include +#include + +#include "benchmark.h" + +enum crypto_bm_status { + CRYPTO_BM_STOP, + CRYPTO_BM_RUN, +}; + +enum crypto_bm_alg { + CRYPTO_BM_ALG_MAX, +}; + +struct crypto_bm_alg_ops { + const char *alg; + int (*init)(struct crypto_bm_base *base); + void (*uninit)(struct crypto_bm_base *base); + int (*create_tfm)(struct crypto_bm_base *base, u32 idx); + void (*release_tfm)(struct crypto_bm_base *base, u32 idx); + int (*create_req)(struct crypto_bm_base *base, u32 idx); + void (*release_req)(struct crypto_bm_base *base, u32 idx); + int (*perf)(struct crypto_bm_thread_data *data); +}; + +struct { + wait_queue_head_t wq; + atomic_t count; +} crypto_bm_wq = { 0 }; + +#define CRYPTO_BM_THREAD_MAX 1024U + +#define algorithm_desc "Testing algorithm name" +#define algtype_desc "Testing algorithm type, according to enum crypto_bm_alg" +#define inputsize_desc "Testing input size" +#define loop_desc "Testing loop times, the unit is kile, 0/1(default, 1 ktimes), 2(2 ktimes) ..." +#define numamask_desc "Testing bind numamask, 0(default, not bind), 1(bind to node 0), 3(bind to node0 and node1) ..." +#define optype_desc "Testing algorithm operation type 0 && 1: 0(default, compress/encipher), 1(decompress/decipher)" +#define reqnum_desc "Testing request number for per tfm, 0/1 (default 1 request), 2(2 requests) ..." 
+#define threadnum_desc "Testing thread number, one 'crypto_tfm' per thread. 0/1 (default 1 thread), 2(2 threads) ..." +#define time_desc "Testing time, the unit is second, 0/1 (default 1 s), 2(2 s) ..." +#define run_desc "Start/stop all the tests based on the configuration, 0(default, not run, stop), or run" + +static atomic_t benchmark_status; + +static struct crypto_bm_attrs benchmark_attrs = { 0 }; + +static struct crypto_bm_base benchmark_base = { + .attrs = &benchmark_attrs, +}; + +static struct crypto_bm_thread_data thread_data[CRYPTO_BM_THREAD_MAX] = { 0 }; + +static struct task_struct *crypto_bm_perf[CRYPTO_BM_THREAD_MAX] = { NULL }; +static struct task_struct *test_thread; + +static struct crypto_bm_alg_ops benchmark_ops[] = { + { + /* sentinel */ + } +}; + +static int crypto_bm_algorithm_param_set(const char *val, const struct kernel_param *kp) +{ + char *s = strstrip((char *)val); + + if (atomic_read(&benchmark_status)) + return -EBUSY; + + if (!crypto_has_alg(s, 0, 0)) { + pr_err("failed to find the algorithm %s\n", s); + return -EINVAL; + } + + return param_set_charp(s, kp); +} + +static const struct kernel_param_ops alg_ops = { + .set = crypto_bm_algorithm_param_set, + .get = param_get_charp, +}; + +module_param_cb(algorithm, &alg_ops, &benchmark_attrs.algorithm, 0644); +MODULE_PARM_DESC(algorithm, algorithm_desc); + +static int crypto_bm_numamask_param_set(const char *val, const struct kernel_param *kp) +{ + if (atomic_read(&benchmark_status)) + return -EBUSY; + + return param_set_hexulong(val, kp); +} + +static const struct kernel_param_ops numamask_ops = { + .set = crypto_bm_numamask_param_set, + .get = param_get_hexulong, +}; + +module_param_cb(numamask, &numamask_ops, &benchmark_attrs.numamask, 0644); +MODULE_PARM_DESC(numamask, numamask_desc); + +#define MODULE_PARAMETER_DEF(xxx) \ +static int xxx##_set(const char *val, const struct kernel_param *kp) \ +{ \ + u32 n; \ + int ret; \ + if (atomic_read(&benchmark_status)) \ + return -EBUSY; \ + ret 
= kstrtou32(val, 10, &n); \ + if (ret != 0) \ + return -EINVAL; \ + return param_set_uint(val, kp); \ +} \ +static const struct kernel_param_ops xxx##_ops = { \ + .set = xxx##_set, \ + .get = param_get_uint \ +}; \ +module_param_cb(xxx, &xxx##_ops, &benchmark_attrs.xxx, 0644); \ +MODULE_PARM_DESC(xxx, xxx##_desc) + +MODULE_PARAMETER_DEF(algtype); +MODULE_PARAMETER_DEF(inputsize); +MODULE_PARAMETER_DEF(loop); +MODULE_PARAMETER_DEF(optype); +MODULE_PARAMETER_DEF(reqnum); +MODULE_PARAMETER_DEF(threadnum); +MODULE_PARAMETER_DEF(time); + +static int crypto_bm_check_params(struct crypto_bm_attrs *attrs) +{ + if (attrs->algorithm == NULL) { + pr_err("algorithm is NULL\n"); + return -EINVAL; + } + + if (attrs->algtype >= CRYPTO_BM_ALG_MAX) { + pr_err("algorithm type %d is invalid\n", attrs->algtype); + return -EINVAL; + } + + if (attrs->inputsize == 0) { + pr_err("input size is 0\n"); + return -EINVAL; + } + + if (attrs->threadnum >= CRYPTO_BM_THREAD_MAX) { + pr_err("thread number is bigger than %u\n", CRYPTO_BM_THREAD_MAX); + return -EINVAL; + } + + return 0; +} + +static void crypto_bm_set_default_params(struct crypto_bm_attrs *attrs) +{ + attrs->loop = (attrs->loop == 0) ? 1 : attrs->loop; + attrs->reqnum = (attrs->reqnum == 0) ? 1 : attrs->reqnum; + attrs->threadnum = (attrs->threadnum == 0) ? 1 : attrs->threadnum; + attrs->time = (attrs->time == 0) ? 
1 : attrs->time; +} + +static int crypto_bm_init_alg(struct crypto_bm_base *base) +{ + u32 idx = base->attrs->algtype; + + return benchmark_ops[idx].init(base); +} + +static void crypto_bm_uninit_alg(struct crypto_bm_base *base) +{ + u32 idx = base->attrs->algtype; + + benchmark_ops[idx].uninit(base); +} + +static int crypto_bm_create_tfm(struct crypto_bm_base *base) +{ + struct crypto_bm_attrs *attrs = base->attrs; + int i, ret, nodes, sbit, count = 0; + u32 threadnum = attrs->threadnum; + u32 threadpernode, threadrest; + u32 idx = attrs->algtype; + + base->gthread = kcalloc(threadnum, sizeof(*base->gthread), GFP_KERNEL); + if (!base->gthread) + return -ENOMEM; + + nodes = bitmap_weight(&attrs->numamask, MAX_NUMNODES); + + if (nodes == 0) { + for (i = 0; i < threadnum; i++) { + base->gthread[i].id = i; + base->gthread[i].node = NUMA_NO_NODE; + ret = benchmark_ops[idx].create_tfm(base, i); + if (ret) + goto out_free_tfm; + } + } else { + threadpernode = threadnum / nodes; + threadrest = threadnum % nodes; + for_each_set_bit(sbit, (unsigned long *)&attrs->numamask, MAX_NUMNODES) { + int start = count * threadpernode; + int end = (count + 1) * threadpernode; + + end += (++count == nodes) ? 
threadrest : 0; + for (i = start; i < end; i++) { + base->gthread[i].id = i; + base->gthread[i].node = sbit; + ret = benchmark_ops[idx].create_tfm(base, i); + if (ret) + goto out_free_tfm; + } + } + } + + return 0; + +out_free_tfm: + for (i--; i >= 0; i--) + benchmark_ops[idx].release_tfm(base, i); + + kfree(base->gthread); + + return ret; +} + +static void crypto_bm_release_tfm(struct crypto_bm_base *base) +{ + u32 threadnum = base->attrs->threadnum; + u32 idx = base->attrs->algtype; + int i; + + for (i = 0; i < threadnum; i++) + benchmark_ops[idx].release_tfm(base, i); + + kfree(base->gthread); +} + +static int crypto_bm_create_req(struct crypto_bm_base *base) +{ + u32 threadnum = base->attrs->threadnum; + u32 idx = base->attrs->algtype; + int i, ret; + + for (i = 0; i < threadnum; i++) { + ret = benchmark_ops[idx].create_req(base, i); + if (ret) + goto out_release_req; + } + + return 0; + +out_release_req: + for (i--; i >= 0 ; i--) + benchmark_ops[idx].release_req(base, i); + + return ret; +} + +static void crypto_bm_release_req(struct crypto_bm_base *base) +{ + u32 threadnum = base->attrs->threadnum; + u32 idx = base->attrs->algtype; + int i; + + for (i = 0; i < threadnum; i++) + benchmark_ops[idx].release_req(base, i); +} + +static int crypto_bm_test_perf(void *data) +{ + struct crypto_bm_thread_data *tdata = data; + struct crypto_bm_base *base = tdata->base; + struct crypto_bm_attrs *attrs = base->attrs; + unsigned long endtime = jiffies + attrs->time * HZ; + u32 idx = attrs->algtype; + int ret; + + do { + if (kthread_should_stop()) + break; + + if (time_after(jiffies, endtime)) + break; + + ret = benchmark_ops[idx].perf(tdata); + if (ret) + break; + } while (1); + + crypto_bm_perf[tdata->threadid] = NULL; + atomic_dec(&crypto_bm_wq.count); + wake_up(&crypto_bm_wq.wq); + + return ret; +} + +static void crypto_bm_show_perf(u64 time) +{ + u32 threadnum = benchmark_attrs.threadnum; + u32 inputsize = benchmark_attrs.inputsize; + u64 throughput, pps, reqsum = 0; + 
int i; + + for (i = 0; i < threadnum; i++) + reqsum += atomic_read(&thread_data[i].count.recv_req); + + /* + * reqsum * inputsize (bytes) / (1024 * 1024) + * throughput = -------------------------------------------- (MB/s) + * time (ns) / 1000000000 + */ + throughput = reqsum * inputsize * 953 / (time); + + /* + * reqsum / 1024 + * pps = ------------------- + * time / 1000000000 + */ + pps = reqsum * 976562 / (time); + + pr_err("Crypto benchmark result:\n" + "\t throughput \t pps \t\t time\n" + "\t %llu MB/s \t %llu kPP/s \t %llu ms\n", + throughput, pps, time / 1000000); +} + +static int crypto_bm_test(void *data) +{ + struct crypto_bm_base *base = data; + u32 threadnum = base->attrs->threadnum; + struct timespec64 begin, end; + int i, ret, node; + + init_waitqueue_head(&crypto_bm_wq.wq); + atomic_set(&crypto_bm_wq.count, threadnum); + + memset(crypto_bm_perf, 0, sizeof(*crypto_bm_perf) * threadnum); + + ret = crypto_bm_init_alg(base); + if (ret) + goto out_set_stop; + + ret = crypto_bm_create_tfm(base); + if (ret) + goto out_uninit; + + ret = crypto_bm_create_req(base); + if (ret) + goto out_free_tfm; + + + for (i = 0; i < threadnum; i++) { + node = base->gthread[i].node; + thread_data[i].threadid = i; + thread_data[i].base = base; + memset(&thread_data[i].count, 0, sizeof(thread_data[i].count)); + crypto_bm_perf[i] = kthread_create_on_node(crypto_bm_test_perf, &thread_data[i], + node, "crypto_bm_perf-%d", i); + if (IS_ERR(crypto_bm_perf[i])) { + ret = PTR_ERR(crypto_bm_perf[i]); + crypto_bm_perf[i] = NULL; + pr_err("failed to create %dth performance thread, ret = %d\n", i, ret); + goto out_stop_thread; + } + kthread_bind_mask(crypto_bm_perf[i], cpumask_of_node(node)); + } + i = 0; + + ktime_get_real_ts64(&begin); + for (i = 0; i < threadnum; i++) + wake_up_process(crypto_bm_perf[i]); + wait_event_interruptible(crypto_bm_wq.wq, atomic_read(&crypto_bm_wq.count) == 0); + ktime_get_real_ts64(&end); + + crypto_bm_show_perf(timespec64_to_ns(&end) - 
timespec64_to_ns(&begin)); + +out_stop_thread: + for (i--; i >= 0; i--) { + if (!crypto_bm_perf[i]) + continue; + kthread_stop(crypto_bm_perf[i]); + crypto_bm_perf[i] = NULL; + } + + crypto_bm_release_req(base); + +out_free_tfm: + crypto_bm_release_tfm(base); + +out_uninit: + crypto_bm_uninit_alg(base); + +out_set_stop: + atomic_set(&benchmark_status, CRYPTO_BM_STOP); + test_thread = NULL; + + return ret; +} + +static int crypto_bm_start_test(struct crypto_bm_base *base) +{ + int ret = 0; + + if (atomic_cmpxchg(&benchmark_status, CRYPTO_BM_STOP, CRYPTO_BM_RUN)) { + pr_err("Crypto benchmark is busy now, please try later!\n"); + return -EBUSY; + } + + test_thread = kthread_run(crypto_bm_test, base, "crypto_bm_test"); + if (IS_ERR(test_thread)) + ret = PTR_ERR(test_thread); + + return ret; +} + +static void crypto_bm_stop_test(void) +{ + u32 threadnum = benchmark_attrs.threadnum; + int i, ret; + + if (!atomic_read(&benchmark_status)) + return; + + for (i = 0; i < threadnum; i++) { + if (!crypto_bm_perf[i]) + continue; + ret = kthread_stop(crypto_bm_perf[i]); + if (ret) + pr_err("failed to stop %dth performance thread, ret = %d\n", i, ret); + crypto_bm_perf[i] = NULL; + } + + if (test_thread) { + ret = kthread_stop(test_thread); + if (ret) + pr_err("failed to stop test thread, ret = %d\n", ret); + } + + atomic_set(&benchmark_status, CRYPTO_BM_STOP); +} + +static int run_set(const char *val, const struct kernel_param *kp) +{ + int ret; + u32 n; + + ret = kstrtou32(val, 10, &n); + if (ret != 0) + return -EINVAL; + + if (n == 0) { + crypto_bm_stop_test(); + } else { + ret = crypto_bm_check_params(&benchmark_attrs); + if (ret) + return ret; + + crypto_bm_set_default_params(&benchmark_attrs); + + ret = crypto_bm_start_test(&benchmark_base); + if (ret) { + pr_err("failed to start test, ret = %d\n", ret); + return ret; + } + pr_info("run set: algorithm %s, algtype %s, inputsize %d, loop %d, numamask 0x%lx, optype %d, reqnum %d, threadnum %d, time %d.\n", + 
benchmark_attrs.algorithm, benchmark_ops[benchmark_attrs.algtype].alg, + benchmark_attrs.inputsize, benchmark_attrs.loop, benchmark_attrs.numamask, + benchmark_attrs.optype, benchmark_attrs.reqnum, benchmark_attrs.threadnum, + benchmark_attrs.time); + } + + return param_set_int(val, kp); +} + +static const struct kernel_param_ops run_ops = { + .set = run_set, + .get = param_get_uint, +}; + +static u32 run; +module_param_cb(run, &run_ops, &run, 0644); +MODULE_PARM_DESC(run, run_desc); + +static int __init crypto_bm_init(void) +{ + atomic_set(&benchmark_status, CRYPTO_BM_STOP); + + return 0; +} + +static void __exit crypto_bm_exit(void) +{ +} + +module_init(crypto_bm_init); +module_exit(crypto_bm_exit); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Driver for testing performance of crypto algorithms"); diff --git a/crypto/benchmark/benchmark.h b/crypto/benchmark/benchmark.h new file mode 100644 index 000000000000..84cb49af81ba --- /dev/null +++ b/crypto/benchmark/benchmark.h @@ -0,0 +1,76 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022 HiSilicon Limited. + */ +#ifndef CRYPTO_BM_H +#define CRYPTO_BM_H + +#include +#include +#include +#include +#include +#include +#include + +/** + * struct crypto_bm_attrs - crypto benchmark attributes configured by users. + * + * @algorithm: The algorithm name registered in crypto. + * @algtype: The algorithm class list in enum crypto_bm_alg. Used to + * choose the crypto_bm_ops. + * @inputsize: The testing length that can greatly impact performance. + * Such as data size for compress or key length for encryption. + * @loop: The request sending loop times. The value is 1000 times + * of user's setting. + * @numamask: The mask of testing bind numa nodes. + * @optype: The algorithm test operation. Defined by the algorithm self. + * @reqnum: The crypto request number of a tfm. + * @threadnum: The test thread number. And it is equal to tfm number. + * @time: The testing time. 
+ */ +struct crypto_bm_attrs { + char *algorithm; + u32 algtype; + u32 inputsize; + u32 loop; + unsigned long numamask; + u32 optype; + u32 reqnum; + u32 threadnum; + u32 time; +}; + +/** + * struct crypto_bm_base - crypto benchmark test objects. + * + * @attrs: The test configuration. + * @gthread: A array storing resources related to the test thread. + */ +struct crypto_bm_base { + struct crypto_bm_attrs *attrs; + struct { + u32 id; + int node; + void *tfm; + void **req; + } *gthread; +}; + +/** + * struct crypto_bm_thread_data - crypto benchmark test thread common information. + * + * @threadid: The test thread number. + * @count: Count the thread test request numbers. + * @base: crypto benchmark test objects. + */ +struct crypto_bm_thread_data { + int threadid; + struct { + atomic_t send_req; + atomic_t recv_req; + } count; + struct crypto_bm_base *base; +} ____cacheline_aligned; + +#endif From patchwork Mon Sep 19 12:05:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 607427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5882C6FA90 for ; Mon, 19 Sep 2022 12:08:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230083AbiISMIJ (ORCPT ); Mon, 19 Sep 2022 08:08:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230060AbiISMIF (ORCPT ); Mon, 19 Sep 2022 08:08:05 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 367782AC43; Mon, 19 Sep 2022 05:08:02 -0700 (PDT) Received: from dggpemm500023.china.huawei.com (unknown [172.30.72.57]) by 
szxga02-in.huawei.com (SkyGuard) with ESMTP id 4MWNc839tQzMn3F; Mon, 19 Sep 2022 20:03:20 +0800 (CST)
Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800
Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800
From: Yang Shen
To: ,
CC: , ,
Subject: [RFC PATCH 3/6] crypto: benchmark - support compression/decompression
Date: Mon, 19 Sep 2022 20:05:34 +0800
Message-ID: <20220919120537.39258-4-shenyang39@huawei.com>
X-Mailer: git-send-email 2.24.0
In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com>
References: <20220919120537.39258-1-shenyang39@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.67.164.66]
X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Register the compression algorithm class with the crypto benchmark.
Users can echo 0 to 'algtype' to select compression/decompression.

Because of the compression protocol, the tool cannot make the compressed
data length equal to 'inputsize'. So in this algorithm class, 'inputsize'
is used as the original (uncompressed) data size for decompression.

To avoid falsely high performance caused by unrealistic cache and TLB
hit rates, the data set holds at least four times as many entries as the
number of crypto_req.
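
The data-set sizing that crypto_bm_init_comp() in this patch applies can
be checked with a small user-space sketch. REQ_NUM, DATAPERREQ, 'totalreq'
and 'dataperreq' mirror the names in bm_comp.c; 'struct comp_dataset_size'
and 'comp_dataset_size_for' are hypothetical names used only for this
illustration.

```c
#include <stdint.h>

#define REQ_NUM    1024u /* floor on data-set entries per thread, as in bm_comp.c */
#define DATAPERREQ 4u    /* minimum data-set entries per request, as in bm_comp.c */

/* Hypothetical result type for this sketch. */
struct comp_dataset_size {
	uint32_t totalreq;   /* data-set entries prepared per thread */
	uint32_t dataperreq; /* entries available to each request */
};

/*
 * Mirror of the sizing rule: use reqnum * DATAPERREQ entries, but never
 * fewer than REQ_NUM. reqnum must be non-zero; the benchmark replaces a
 * zero 'reqnum' with 1 before the algorithm init callback runs.
 */
static struct comp_dataset_size comp_dataset_size_for(uint32_t reqnum)
{
	struct comp_dataset_size s;

	if (reqnum * DATAPERREQ >= REQ_NUM)
		s.totalreq = reqnum * DATAPERREQ;
	else
		s.totalreq = REQ_NUM;

	s.dataperreq = s.totalreq / reqnum;

	return s;
}
```

With 'reqnum' = 100 the floor applies, giving 1024 entries and 10 entries
per request. With 'reqnum' = 300 the multiplier applies, giving 1200
entries and 4 per request.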
Signed-off-by: Yang Shen --- crypto/benchmark/Makefile | 2 +- crypto/benchmark/benchmark.c | 11 + crypto/benchmark/bm_comp.c | 425 +++++++++++++++++++++++++++++++++++ crypto/benchmark/bm_comp.h | 18 ++ 4 files changed, 455 insertions(+), 1 deletion(-) create mode 100644 crypto/benchmark/bm_comp.c create mode 100644 crypto/benchmark/bm_comp.h -- 2.24.0 diff --git a/crypto/benchmark/Makefile b/crypto/benchmark/Makefile index 5244178e14c4..f638535442ba 100644 --- a/crypto/benchmark/Makefile +++ b/crypto/benchmark/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_CRYPTO_BENCHMARK) += crypto_benchmark.o -crypto_benchmark-objs += benchmark.o +crypto_benchmark-objs += benchmark.o bm_comp.o diff --git a/crypto/benchmark/benchmark.c b/crypto/benchmark/benchmark.c index 9a833b277d87..b5dcf5829b22 100644 --- a/crypto/benchmark/benchmark.c +++ b/crypto/benchmark/benchmark.c @@ -11,6 +11,7 @@ #include #include "benchmark.h" +#include "bm_comp.h" enum crypto_bm_status { CRYPTO_BM_STOP, @@ -18,6 +19,7 @@ enum crypto_bm_status { }; enum crypto_bm_alg { + CRYPTO_BM_COMP, CRYPTO_BM_ALG_MAX, }; @@ -65,6 +67,15 @@ static struct task_struct *test_thread; static struct crypto_bm_alg_ops benchmark_ops[] = { { + .alg = "CRYPTO_COMPRESS", + .init = crypto_bm_init_comp, + .uninit = crypto_bm_uninit_comp, + .create_tfm = crypto_bm_create_tfm_comp, + .release_tfm = crypto_bm_release_tfm_comp, + .create_req = crypto_bm_create_req_comp, + .release_req = crypto_bm_release_req_comp, + .perf = crypto_bm_perf_comp, + }, { /* sentinel */ } }; diff --git a/crypto/benchmark/bm_comp.c b/crypto/benchmark/bm_comp.c new file mode 100644 index 000000000000..2772a8e86e2e --- /dev/null +++ b/crypto/benchmark/bm_comp.c @@ -0,0 +1,425 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022 HiSilicon Limited. 
+ */ +#include +#include + +#include "benchmark.h" +#include "bm_comp.h" + +#define COMP_BUF_SIZE 1024 +#define REQ_NUM 1024 +#define DATAPERREQ 4 + +enum crypto_bm_comp_optype { + CRYPTO_BM_COMPRESS, + CRYPTO_BM_DECOMPRESS, + CRYPTO_BM_OPS_MAX, +}; + +struct crypto_bm_comp_buffer { + void *input; + void *output; + struct scatterlist *src; + struct scatterlist *dst; +}; + +struct crypto_bm_comp_cb_data { + atomic_t is_used; + struct crypto_bm_thread_data *tdata; +}; + +struct crypto_bm_comp_data { + u32 input_size; + u32 output_size; + u32 last_used; + struct crypto_bm_comp_buffer *buffers; + struct crypto_bm_comp_cb_data *cb_datas; +}; + +struct crypto_bm_comp_testvec { + int inlen; + int outlen; + char input[COMP_BUF_SIZE]; + char output[COMP_BUF_SIZE]; +}; + +struct crypto_bm_comp_test_func { + int (*testfun)(struct acomp_req *req); +}; + +static int dataperreq; + +static int totalreq; + +static struct crypto_bm_comp_data *data_array; + +static const struct crypto_bm_comp_testvec comp_compress_tv = { + .inlen = 70, + .input = "Join us now and share the software " + "Join us now and share the software ", +}; + +static const struct crypto_bm_comp_test_func testfunc[] = { + { + .testfun = crypto_acomp_compress, + }, { + .testfun = crypto_acomp_decompress, + }, { + /* sentinel */ + } +}; + +static void crypto_bm_comp_cb(struct crypto_async_request *base, int err); + +int crypto_bm_init_comp(struct crypto_bm_base *base) +{ + struct crypto_bm_attrs *attrs = base->attrs; + + if (attrs->optype >= CRYPTO_BM_OPS_MAX) { + pr_err("Optype should be 0 for compression or 1 for decompression!\n"); + return -EINVAL; + } + + if (attrs->reqnum * DATAPERREQ >= REQ_NUM) + totalreq = attrs->reqnum * DATAPERREQ; + else + totalreq = REQ_NUM; + + dataperreq = totalreq / attrs->reqnum; + + data_array = kcalloc(attrs->threadnum, sizeof(*data_array), GFP_KERNEL); + if (!data_array) + return -ENOMEM; + + return 0; +} + +void crypto_bm_uninit_comp(struct crypto_bm_base *base) +{ + 
kfree(data_array); +} + +int crypto_bm_create_tfm_comp(struct crypto_bm_base *base, u32 idx) +{ + char *alg = base->attrs->algorithm; + int node = base->gthread[idx].node; + int ret = 0; + + base->gthread[idx].tfm = crypto_alloc_acomp_node(alg, 0, 0, node); + if (IS_ERR(base->gthread[idx].tfm)) { + ret = PTR_ERR(base->gthread[idx].tfm); + pr_err("failed to alloc %dth acomp, ret = %d\n", idx, ret); + } + + return ret; +} + +void crypto_bm_release_tfm_comp(struct crypto_bm_base *base, u32 idx) +{ + crypto_free_acomp(base->gthread[idx].tfm); +} + +static void crypto_bm_comp_copy_data_compress(u32 idx) +{ + struct crypto_bm_comp_data *data = &data_array[idx]; + u32 block, inlen, inputsize = data->input_size; + void *buffer; + int i, j; + + block = DIV_ROUND_UP(inputsize, comp_compress_tv.inlen); + for (i = 0; i < totalreq; i++) { + inlen = inputsize; + buffer = data->buffers[i].input; + for (j = 0; j < block; j++) { + memcpy(buffer, comp_compress_tv.input, + j == block - 1 ? inlen : comp_compress_tv.inlen); + buffer += comp_compress_tv.inlen; + inlen -= comp_compress_tv.inlen; + } + } +} + +static int crypto_bm_comp_copy_data_decompress(struct crypto_bm_base *base, u32 idx) +{ + struct crypto_bm_comp_data *data = &data_array[idx]; + struct crypto_acomp *acomp = base->gthread[idx].tfm; + struct crypto_wait wait; + struct acomp_req *req; + u32 block, inlen; + void *buffer; + int i, ret; + + req = acomp_request_alloc(acomp); + if (!req) + return -ENOMEM; + + inlen = data->input_size; + block = DIV_ROUND_UP(inlen, comp_compress_tv.inlen); + buffer = data->buffers[0].input; + for (i = 0; i < block; i++) { + memcpy(buffer, comp_compress_tv.input, + i == block - 1 ? inlen : comp_compress_tv.inlen); + buffer += comp_compress_tv.inlen; + inlen -= comp_compress_tv.inlen; + } + + /* + * For decompression, the tool need to prepare compressed data according + * to crypto_bm_attrs.inputsize. 
And here it is hard to make the compressed + * data length equal to 'inputsize' value, so make the origin data length + * equal to 'inputsize' value. + */ + crypto_init_wait(&wait); + acomp_request_set_callback(req, 0, crypto_req_done, &wait); + acomp_request_set_params(req, data->buffers[0].src, data->buffers[0].dst, + data->input_size, data->output_size); + + ret = crypto_wait_req(crypto_acomp_compress(req), &wait); + if (ret) { + pr_err("failed to prepare decompression data.\n"); + goto out_free_req; + } + + for (i = 0; i < totalreq; i++) + memcpy(data->buffers[i].input, data->buffers[0].output, req->dlen); + +out_free_req: + acomp_request_free(req); + + return ret; +} + +static int crypto_bm_comp_init_data(struct crypto_bm_base *base, u32 idx) +{ + struct crypto_bm_comp_data *data = &data_array[idx]; + int i, ret, node = base->gthread[idx].node; + struct crypto_bm_comp_buffer *buffer; + u32 reqnum = base->attrs->reqnum; + u32 optype = base->attrs->optype; + + data->input_size = base->attrs->inputsize; + data->output_size = base->attrs->inputsize; + + data->buffers = kcalloc_node(1, sizeof(*data->buffers) * totalreq, GFP_KERNEL, node); + if (!data->buffers) + return -ENOMEM; + + data->cb_datas = kcalloc_node(1, sizeof(*data->cb_datas) * reqnum, GFP_KERNEL, node); + if (!data->cb_datas) { + ret = -ENOMEM; + goto out_free_buffers; + } + + for (i = 0; i < totalreq; i++) { + buffer = &data->buffers[i]; + buffer->src = kcalloc_node(1, sizeof(struct scatterlist), GFP_KERNEL, node); + if (!buffer->src) { + ret = -ENOMEM; + goto out_free_src; + } + + buffer->dst = kcalloc_node(1, sizeof(struct scatterlist), GFP_KERNEL, node); + if (!buffer->dst) { + ret = -ENOMEM; + goto out_free_dst; + } + + buffer->input = kcalloc_node(1, data->input_size, GFP_KERNEL, node); + if (!buffer->input) { + ret = -ENOMEM; + goto out_free_input; + } + + buffer->output = kcalloc_node(1, data->output_size, GFP_KERNEL, node); + if (!buffer->output) { + ret = -ENOMEM; + goto out_free_output; + } 
+ + sg_init_one(buffer->src, buffer->input, data->input_size); + sg_init_one(buffer->dst, buffer->output, data->output_size); + } + + if (optype == CRYPTO_BM_COMPRESS) { + crypto_bm_comp_copy_data_compress(idx); + } else { + ret = crypto_bm_comp_copy_data_decompress(base, idx); + if (ret) { + /* all buffers are fully allocated here; free every one */ + goto out_free_src; + } + } + + return 0; + +out_free_output: + kfree(buffer->input); + +out_free_input: + kfree(buffer->dst); + +out_free_dst: + kfree(buffer->src); + +out_free_src: + for (i--; i >= 0; i--) { + buffer = &data->buffers[i]; + kfree(buffer->src); + kfree(buffer->dst); + kfree(buffer->input); + kfree(buffer->output); + } + + kfree(data->cb_datas); + +out_free_buffers: + kfree(data->buffers); + + return ret; +} + +static void crypto_bm_comp_uninit_data(struct crypto_bm_base *base, u32 idx) +{ + struct crypto_bm_comp_data *data = &data_array[idx]; + struct crypto_bm_comp_buffer *buffer; + int i; + + for (i = 0; i < totalreq; i++) { + buffer = &data->buffers[i]; + kfree(buffer->src); + kfree(buffer->dst); + kfree(buffer->input); + kfree(buffer->output); + } + + kfree(data->cb_datas); + kfree(data->buffers); +} + +static int crypto_bm_comp_alloc_req(struct crypto_bm_base *base, u32 idx) +{ + struct crypto_bm_comp_data *data = &data_array[idx]; + int node = base->gthread[idx].node; + u32 reqnum = base->attrs->reqnum; + struct acomp_req *req; + int i; + + base->gthread[idx].req = kcalloc_node(reqnum, sizeof(struct acomp_req *), GFP_KERNEL, node); + if (!base->gthread[idx].req) + return -ENOMEM; + + for (i = 0; i < reqnum; i++) { + req = acomp_request_alloc(base->gthread[idx].tfm); + if (!req) { + pr_err("failed to allocate acomp request\n"); + goto out_free_req; + } + + acomp_request_set_callback(req, 0, crypto_bm_comp_cb, &data->cb_datas[i]); + base->gthread[idx].req[i] = req; + } + + return 0; + +out_free_req: + for (i--; i >= 0; i--) + acomp_request_free(base->gthread[idx].req[i]); + + kfree(base->gthread[idx].req); + + return -ENOMEM; +} + +static void 
crypto_bm_comp_free_req(struct crypto_bm_base *base, u32 idx) +{ + u32 reqnum = base->attrs->reqnum; + int i; + + for (i = 0; i < reqnum; i++) + acomp_request_free(base->gthread[idx].req[i]); +} + +int crypto_bm_create_req_comp(struct crypto_bm_base *base, u32 idx) +{ + int ret; + + ret = crypto_bm_comp_init_data(base, idx); + if (ret) + return ret; + + ret = crypto_bm_comp_alloc_req(base, idx); + if (ret) + goto out_free_buf; + + return 0; + +out_free_buf: + crypto_bm_comp_uninit_data(base, idx); + + return ret; +} + +void crypto_bm_release_req_comp(struct crypto_bm_base *base, u32 idx) +{ + crypto_bm_comp_free_req(base, idx); + crypto_bm_comp_uninit_data(base, idx); +} + +static void crypto_bm_comp_cb(struct crypto_async_request *base, int err) +{ + struct crypto_bm_comp_cb_data *data = base->data; + + atomic_inc(&data->tdata->count.recv_req); + atomic_set(&data->is_used, 0); +} + +int crypto_bm_perf_comp(struct crypto_bm_thread_data *data) +{ + struct crypto_bm_base *base = data->base; + int i, j, ret, last_used, send_req = 0; + u32 loop = base->attrs->loop * 1000; + u32 reqnum = base->attrs->reqnum; + u32 threadid = data->threadid; + struct crypto_bm_comp_data *comp_data = &data_array[threadid]; + struct crypto_bm_comp_buffer *buffer; + struct acomp_req *req; + + for (i = 0; i < reqnum; i++) + comp_data->cb_datas[i].tdata = data; + + for (i = 0; i < loop; i++) { + for (j = 0; j < reqnum; j++) { + if (atomic_read(&comp_data->cb_datas[j].is_used)) + continue; + req = base->gthread[threadid].req[j]; + last_used = comp_data->last_used; + buffer = &comp_data->buffers[last_used + j * dataperreq]; + acomp_request_set_params(req, buffer->src, buffer->dst, + comp_data->input_size, comp_data->output_size); + atomic_set(&comp_data->cb_datas[j].is_used, 1); + ret = testfunc[base->attrs->optype].testfun(req); + if (!ret) { + atomic_inc(&data->count.recv_req); + atomic_set(&comp_data->cb_datas[j].is_used, 0); + } + if (unlikely(ret && ret != -EINPROGRESS && ret != -EBUSY)) { 
+ pr_err("failed to compress req, ret %d\n", ret); + atomic_set(&comp_data->cb_datas[j].is_used, 0); + break; + } + ret = 0; + comp_data->last_used = (last_used + 1) % dataperreq; + send_req++; + } + } + + atomic_add(send_req, &data->count.send_req); + send_req = atomic_read(&data->count.send_req); + + while (atomic_read(&data->count.recv_req) != send_req) + ; + + return ret; +} diff --git a/crypto/benchmark/bm_comp.h b/crypto/benchmark/bm_comp.h new file mode 100644 index 000000000000..78b45f8b22a6 --- /dev/null +++ b/crypto/benchmark/bm_comp.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022 HiSilicon Limited. + */ +#ifndef CRYPTO_BM_COMP_H +#define CRYPTO_BM_COMP_H + +#include + +int crypto_bm_init_comp(struct crypto_bm_base *base); +void crypto_bm_uninit_comp(struct crypto_bm_base *base); +int crypto_bm_create_tfm_comp(struct crypto_bm_base *base, u32 idx); +void crypto_bm_release_tfm_comp(struct crypto_bm_base *base, u32 idx); +int crypto_bm_create_req_comp(struct crypto_bm_base *base, u32 idx); +void crypto_bm_release_req_comp(struct crypto_bm_base *base, u32 idx); +int crypto_bm_perf_comp(struct crypto_bm_thread_data *data); + +#endif From patchwork Mon Sep 19 12:05:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 608039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C181CECAAD3 for ; Mon, 19 Sep 2022 12:08:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230119AbiISMIL (ORCPT ); Mon, 19 Sep 2022 08:08:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230053AbiISMIF (ORCPT ); Mon, 19 Sep 2022 
08:08:05 -0400 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 916402B25B; Mon, 19 Sep 2022 05:08:02 -0700 (PDT) Received: from dggpemm500023.china.huawei.com (unknown [172.30.72.57]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4MWNg512lbzHnxJ; Mon, 19 Sep 2022 20:05:53 +0800 (CST) Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 From: Yang Shen To: , CC: , , Subject: [RFC PATCH 4/6] crypto: benchmark - add help information Date: Mon, 19 Sep 2022 20:05:35 +0800 Message-ID: <20220919120537.39258-5-shenyang39@huawei.com> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com> References: <20220919120537.39258-1-shenyang39@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.164.66] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Add a new module parameter 'help' to help users understand the benchmark module parameters. Because each algorithm class has its own notes on these parameters, also add a new callback 'help' so that each class can describe the differences.
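The name-to-text lookup behind the 'help' parameter can be sketched in user space as below. This is a minimal sketch, assuming a table shaped like 'modules_help' in this patch. The names bm_mp_info, bm_help and bm_help_lookup() are hypothetical, and only two table entries are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space mirror of struct crypto_bm_mp_info. */
struct bm_mp_info {
	const char *mp;        /* module parameter name */
	const char *help_info; /* text shown for that parameter */
};

static const struct bm_mp_info bm_help[] = {
	{ "loop", "Please input the send loop times." },
	{ "time", "Please input a valid value for testing time." },
};

/* Return the help text for 'name', or NULL when the name is unknown. */
static const char *bm_help_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(bm_help) / sizeof(bm_help[0]); i++) {
		if (!strcmp(name, bm_help[i].mp))
			return bm_help[i].help_info;
	}

	return NULL;
}
```

Returning NULL for an unknown name is a design suggestion rather than what the patch does: the posted set handler returns 0 silently when nothing matches, while a lookup like this would let it report -EINVAL to the user instead.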
Signed-off-by: Yang Shen --- crypto/benchmark/benchmark.c | 79 ++++++++++++++++++++++++++++++++++++ crypto/benchmark/bm_comp.c | 10 +++++ crypto/benchmark/bm_comp.h | 1 + 3 files changed, 90 insertions(+) diff --git a/crypto/benchmark/benchmark.c b/crypto/benchmark/benchmark.c index b5dcf5829b22..a3ccd8955eaa 100644 --- a/crypto/benchmark/benchmark.c +++ b/crypto/benchmark/benchmark.c @@ -32,6 +32,12 @@ struct crypto_bm_alg_ops { int (*create_req)(struct crypto_bm_base *base, u32 idx); void (*release_req)(struct crypto_bm_base *base, u32 idx); int (*perf)(struct crypto_bm_thread_data *data); + void (*help)(void); +}; + +struct crypto_bm_mp_info { + const char *mp; + const char *help_info; }; struct { @@ -51,6 +57,9 @@ struct { #define threadnum_desc "Testing thread number, one 'crypto_tfm' per thread. 0/1 (default 1 thread), 2(2 threads) ..." #define time_desc "Testing time, the unit is second, 0/1 (default 1 s), 2(2 s) ..." #define run_desc "Start/stop all the tests based on the configuration, 0(default, not run, stop), or run" +#define help_desc "Help information. Echo a module parameter name to 'help' to print " \ + "its description. Cat 'help' to print the help information " \ + "provided by the current 'algtype'."
static atomic_t benchmark_status; @@ -75,11 +84,47 @@ static struct crypto_bm_alg_ops benchmark_ops[] = { .create_req = crypto_bm_create_req_comp, .release_req = crypto_bm_release_req_comp, .perf = crypto_bm_perf_comp, + .help = crypto_bm_help_comp, }, { /* sentinel */ } }; +static struct crypto_bm_mp_info modules_help[] = { + { + .mp = "algorithm", + .help_info = "Please input a crypto supported algorithm name.\n" + "The algorithm name can be found in /proc/crypto.", + }, { + .mp = "algtype", + .help_info = "Please input a valid value to choose algorithm class.\n" + "0: CRYPTO_BM_COMP", + }, { + .mp = "inputsize", + .help_info = "Please input a valid value as testing input size.", + }, { + .mp = "loop", + .help_info = "Please input the send loop times.", + }, { + .mp = "numamask", + .help_info = "Please input a bitmap of the NUMA nodes to test on.", + }, { + .mp = "optype", + .help_info = "Please input a valid value for testing operation.\n" + "Cat 'help' to see the optype values supported by the current algtype.",
+ }, { + .mp = "reqnum", + .help_info = "Please input a valid value for per-thread request number.", + }, { + .mp = "threadnum", + .help_info = "Please input a valid value for creating threads.\n" + "One thread will create a crypto_tfm.", + }, { + .mp = "time", + .help_info = "Please input a valid value for testing time.", + } +}; + static int crypto_bm_algorithm_param_set(const char *val, const struct kernel_param *kp) { char *s = strstrip((char *)val); @@ -103,6 +148,40 @@ static const struct kernel_param_ops alg_ops = { module_param_cb(algorithm, &alg_ops, &benchmark_attrs.algorithm, 0644); MODULE_PARM_DESC(algorithm, algorithm_desc); +static int crypto_bm_help_param_set(const char *val, const struct kernel_param *kp) +{ + int size = ARRAY_SIZE(modules_help); + char *s = strstrip((char *)val); + int i; + + for (i = 0; i < size; i++) { + if (!strcmp(s, modules_help[i].mp)) + pr_info("%s\n", modules_help[i].help_info); + } + + return 0; +} + +static int crypto_bm_help_param_get(char *val, const struct kernel_param *kp) +{ + u32 idx = benchmark_attrs.algtype; + + if (idx >= CRYPTO_BM_ALG_MAX) + return -EINVAL; + + benchmark_ops[idx].help(); + + return 0; +} + +static const struct kernel_param_ops help_ops = { + .set = crypto_bm_help_param_set, + .get = crypto_bm_help_param_get, +}; + +module_param_cb(help, &help_ops, NULL, 0644); +MODULE_PARM_DESC(help, help_desc); + static int crypto_bm_numamask_param_set(const char *val, const struct kernel_param *kp) { if (atomic_read(&benchmark_status)) diff --git a/crypto/benchmark/bm_comp.c b/crypto/benchmark/bm_comp.c index 2772a8e86e2e..62192a55b2ab 100644 --- a/crypto/benchmark/bm_comp.c +++ b/crypto/benchmark/bm_comp.c @@ -423,3 +423,13 @@ int crypto_bm_perf_comp(struct crypto_bm_thread_data *data) return ret; } + +void crypto_bm_help_comp(void) +{ + pr_info("Welcome to the crypto benchmark for compression algorithms!\n" + "There are some class-specific module parameter requirements:\n" + "optype: 0 for compression, 1 for 
decompression\n" + "inputsize: for compression, the inputsize is src_len,\n" + " for decompression, the inputsize is dst_len, and the src_len will depend on the data compression ratio.\n" + ); +} diff --git a/crypto/benchmark/bm_comp.h b/crypto/benchmark/bm_comp.h index 78b45f8b22a6..aedafde2c3ad 100644 --- a/crypto/benchmark/bm_comp.h +++ b/crypto/benchmark/bm_comp.h @@ -14,5 +14,6 @@ void crypto_bm_release_tfm_comp(struct crypto_bm_base *base, u32 idx); int crypto_bm_create_req_comp(struct crypto_bm_base *base, u32 idx); void crypto_bm_release_req_comp(struct crypto_bm_base *base, u32 idx); int crypto_bm_perf_comp(struct crypto_bm_thread_data *data); +void crypto_bm_help_comp(void); #endif From patchwork Mon Sep 19 12:05:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 608041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92CE1C6FA8B for ; Mon, 19 Sep 2022 12:08:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230089AbiISMIH (ORCPT ); Mon, 19 Sep 2022 08:08:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37146 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230019AbiISMIE (ORCPT ); Mon, 19 Sep 2022 08:08:04 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E1B72AC50; Mon, 19 Sep 2022 05:08:02 -0700 (PDT) Received: from dggpemm500023.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4MWNc8577mzMn4n; Mon, 19 Sep 2022 20:03:20 +0800 (CST) Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft 
SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 From: Yang Shen To: , CC: , , Subject: [RFC PATCH 5/6] crypto: benchmark - add API documentation Date: Mon, 19 Sep 2022 20:05:36 +0800 Message-ID: <20220919120537.39258-6-shenyang39@huawei.com> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com> References: <20220919120537.39258-1-shenyang39@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.164.66] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Provide a crypto benchmark to help developers quickly measure the performance of an algorithm registered with the crypto subsystem. To cover more scenarios, the tool exposes the following parameters under '/sys/module/crypto_benchmark/parameters/': algorithm, algtype, inputsize, loop, numamask, optype, reqnum, threadnum and time. To hide the differences between algorithm classes, each class implements the following callbacks: init, uninit, create_tfm, release_tfm, create_req, release_req, perf and help. Signed-off-by: Yang Shen --- Documentation/crypto/benchmark.rst | 104 +++++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100644 Documentation/crypto/benchmark.rst diff --git a/Documentation/crypto/benchmark.rst b/Documentation/crypto/benchmark.rst new file mode 100644 index 000000000000..e9b13e81bce3 --- /dev/null +++ b/Documentation/crypto/benchmark.rst @@ -0,0 +1,104 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +Crypto Benchmark +================ + +Overview +-------- +The crypto benchmark is a performance testing tool for crypto algorithms. + +Design +------ + +1. Parameters + +The crypto benchmark is used to test the algorithms registered in the crypto +subsystem. Users can use module parameters to simulate different scenarios. +Balancing test coverage against ease of use, the benchmark +tool has the following module parameters: + +- algorithm +The 'algorithm' is used to create a 'crypto_tfm'. Valid algorithm names +can be found in /proc/crypto. + +- algtype +The 'algtype' selects the algorithm class and its operations. To list the +available algorithm classes, echo 'algtype' to +/sys/module/crypto_benchmark/parameters/help. + +- inputsize +The 'inputsize' is the input size used for testing; the output size is set +according to the algorithm. + +- loop +The 'loop' is the number of attempts to send a request for each 'crypto_req'. +Repeating the send smooths out performance fluctuations caused by the environment. +In synchronous mode, the number of requests sent equals the loop count. +In asynchronous mode, it is often less than the loop count. + +- numamask +The 'numamask' selects the NUMA nodes that the test threads are bound to. The input is +parsed as a bitmap. + +- optype +The 'optype' selects the algorithm operation. To list the +operation types available for the selected 'algtype', cat +/sys/module/crypto_benchmark/parameters/help. +For example, choose compression or decompression when testing crypto comp. + +- reqnum +The 'reqnum' is the number of requests per crypto tfm. In asynchronous mode, +one thread may use multiple 'crypto_req' to improve performance. One request +per thread is the synchronous model. + +- threadnum +The 'threadnum' is the number of test threads to create. To simplify the model, +one 'crypto_tfm' is created per thread.
Note that the threads are divided +equally among the specified NUMA nodes, and any threads left over after the +equal division are created on the last node. + +- time +The 'time' is the test duration in seconds; it determines when the test threads stop. If the time +has not elapsed, a thread sends another batch of 'loop' requests. + +- run +The 'run' is used to trigger the test. Echo 0 to stop all test threads, +and any other value to start the test. + +- help +The 'help' guides users through the test interface. Echo a module +parameter name to 'help' to get its detailed description. Cat 'help' +to get the class-specific information for the current 'algtype'. + +2. Register + +Crypto algorithms differ in too many ways. Therefore, the +crypto benchmark only handles the common work. Everything class-specific +is delegated to the callbacks of the algorithm class. A typical crypto +task can be divided into three parts: allocate a tfm, allocate requests, and send +requests. + +A new algorithm class that registers with the crypto benchmark should implement +the following callbacks: + +- init & uninit +The initialization-related functions. The algorithm class can do its private setup here. + +- create_tfm & release_tfm +The crypto_tfm-related functions. Each algorithm class has its own tfm type, +but they all have a member named tfm, so tfm is used to stand for the algorithm +handle. The benchmark provides the tfm array. + +- create_req & release_req +The crypto_req-related functions. The registrant should create a group of 'reqnum' +'crypto_req' in struct 'crypto_bm_base'. It is also suggested to +prepare the request data in this function. To simulate realistic cache and TLB +hit rates, using a large data set is a good plan. + +- perf +The request-sending function. The registrant should use the 'loop' parameter +to send requests repeatedly and update the counters in struct +'crypto_bm_thread_data'. + +- help +The function that explains the class-specific meaning of the parameters.
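The registration scheme in the "Register" section can be illustrated with a small user-space sketch. It assumes a sentinel-terminated ops table indexed by 'algtype', in the spirit of 'benchmark_ops' from this series. The struct bm_alg_ops, its simplified callback signatures, the "DEMO" class and bm_run() are all hypothetical and do not match the kernel prototypes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct crypto_bm_alg_ops. */
struct bm_alg_ops {
	const char *alg;
	int (*init)(void);
	void (*uninit)(void);
	int (*perf)(unsigned int loop, unsigned int *sent);
};

static int demo_inited;

static int demo_init(void)
{
	demo_inited = 1;
	return 0;
}

static void demo_uninit(void)
{
	demo_inited = 0;
}

/* Pretend every send completes synchronously, so sent == loop. */
static int demo_perf(unsigned int loop, unsigned int *sent)
{
	unsigned int i;

	*sent = 0;
	for (i = 0; i < loop; i++)
		(*sent)++;

	return 0;
}

enum { BM_DEMO, BM_ALG_MAX };

static const struct bm_alg_ops bm_ops[] = {
	[BM_DEMO] = {
		.alg = "DEMO",
		.init = demo_init,
		.uninit = demo_uninit,
		.perf = demo_perf,
	},
	{ .alg = NULL }, /* sentinel */
};

/* Common work: validate algtype, then init -> perf -> uninit. */
static int bm_run(unsigned int algtype, unsigned int loop, unsigned int *sent)
{
	const struct bm_alg_ops *ops;
	int ret;

	assert(sent != NULL);
	if (algtype >= BM_ALG_MAX)
		return -1;

	ops = &bm_ops[algtype];
	ret = ops->init();
	if (ret)
		return ret;

	ret = ops->perf(loop, sent);
	ops->uninit();
	return ret;
}
```

Calling bm_run(BM_DEMO, 5, &sent) runs the demo class and leaves sent at 5. The point of the table is that the common code never needs to know which class it is driving; adding a class means adding one enum value and one table entry.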
From patchwork Mon Sep 19 12:05:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shen X-Patchwork-Id: 607429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93378ECAAD3 for ; Mon, 19 Sep 2022 12:08:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230081AbiISMIG (ORCPT ); Mon, 19 Sep 2022 08:08:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230038AbiISMIE (ORCPT ); Mon, 19 Sep 2022 08:08:04 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B20212B268; Mon, 19 Sep 2022 05:08:02 -0700 (PDT) Received: from dggpemm500023.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4MWNd40hNGzmVVp; Mon, 19 Sep 2022 20:04:08 +0800 (CST) Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 Received: from localhost.localdomain (10.67.164.66) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 19 Sep 2022 20:08:00 +0800 From: Yang Shen To: , CC: , , Subject: [RFC PATCH 6/6] MAINTAINERS: add crypto benchmark MAINTAINER Date: Mon, 19 Sep 2022 20:05:37 +0800 Message-ID: <20220919120537.39258-7-shenyang39@huawei.com> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20220919120537.39258-1-shenyang39@huawei.com> References: 
<20220919120537.39258-1-shenyang39@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.164.66] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Add the maintainer information for the crypto benchmark. Signed-off-by: Yang Shen --- MAINTAINERS | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 164f67e59e5f..89beaebfab23 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5445,6 +5445,13 @@ F: include/crypto/ F: include/linux/crypto* F: lib/crypto/ +CRYPTO BENCHMARK TOOL +M: Yang Shen +L: linux-crypto@vger.kernel.org +S: Maintained +F: Documentation/crypto/benchmark.rst +F: crypto/benchmark/ + CRYPTOGRAPHIC RANDOM NUMBER GENERATOR M: Neil Horman L: linux-crypto@vger.kernel.org