From patchwork Thu Oct 6 22:31:45 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 613008
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 1/7] rcu: correct CONFIG_RCU_EXP_CPU_STALL_TIMEOUT descriptions
Date: Thu, 6 Oct 2022 17:31:45 -0500
Message-Id: <20221006223151.22159-2-elliott@hpe.com>
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Make the descriptions of CONFIG_RCU_EXP_CPU_STALL_TIMEOUT match the code:
- there is no longer a default of 20 ms for Android since commit
  1045a06724f3 ("remove CONFIG_ANDROID"),
- the code enforces a maximum of 21 seconds, evident when specifying 0,
  which means to use the CONFIG_RCU_CPU_STALL_TIMEOUT value (60 seconds
  in the example below).

Example .config:
    CONFIG_RCU_CPU_STALL_TIMEOUT=60
    CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0

leads to:
    /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout:60
    /sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout:21000

Fixes: 1045a06724f3 ("remove CONFIG_ANDROID")
Signed-off-by: Robert Elliott
---
 Documentation/RCU/stallwarn.rst | 9 +++++----
 kernel/rcu/Kconfig.debug        | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index e38c587067fc..d86a8b47504f 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -168,10 +168,11 @@ CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
 	Same as the CONFIG_RCU_CPU_STALL_TIMEOUT parameter but only for
 	the expedited grace period. This parameter defines the period
 	of time that RCU will wait from the beginning of an expedited
-	grace period until it issues an RCU CPU stall warning. This time
-	period is normally 20 milliseconds on Android devices. A zero
-	value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be used,
-	after conversion to milliseconds.
+	grace period until it issues an RCU CPU stall warning.
+
+	A zero value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be
+	used, after conversion to milliseconds, limited to a maximum of
+	21 seconds.
 
 	This configuration parameter may be changed at runtime via the
 	/sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, however
diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 1b0c41d490f0..4477eeb8a54f 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -93,7 +93,7 @@ config RCU_EXP_CPU_STALL_TIMEOUT
 	  If the RCU grace period persists, additional CPU stall warnings
 	  are printed at more widely spaced intervals.  A value of zero
 	  says to use the RCU_CPU_STALL_TIMEOUT value converted from
-	  seconds to milliseconds.
+	  seconds to milliseconds, limited to a maximum of 21 seconds.
 
 config RCU_TRACE
 	bool "Enable tracing for RCU"
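For reference, the fallback behavior described above can be illustrated with a small
user-space sketch. This is not the kernel implementation; the helper name and variables
are illustrative, with only the 21-second cap and the example .config values taken from
the commit message:

#include <stdio.h>

#define RCU_EXP_STALL_TIMEOUT_MAX_MS (21 * 1000)	/* 21 second cap described above */

static int rcu_cpu_stall_timeout = 60;	/* seconds, CONFIG_RCU_CPU_STALL_TIMEOUT */
static int rcu_exp_cpu_stall_timeout;	/* ms; 0 means derive from the value above */

static int exp_stall_timeout_ms(void)
{
	int ms = rcu_exp_cpu_stall_timeout;

	if (ms == 0)				/* fall back to the normal timeout ... */
		ms = rcu_cpu_stall_timeout * 1000;	/* ... converted to milliseconds */
	if (ms > RCU_EXP_STALL_TIMEOUT_MAX_MS)	/* ... and clamp to 21 seconds */
		ms = RCU_EXP_STALL_TIMEOUT_MAX_MS;
	return ms;
}

int main(void)
{
	/* with the example .config values this prints 21000 */
	printf("%d\n", exp_stall_timeout_ms());
	return 0;
}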
From patchwork Thu Oct 6 22:31:46 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 613011
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 2/7] crypto: x86/sha - limit FPU preemption
Date: Thu, 6 Oct 2022 17:31:46 -0500
Message-Id: <20221006223151.22159-3-elliott@hpe.com>
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 impostorscore=0 mlxlogscore=999 suspectscore=0 adultscore=0 spamscore=0
bulkscore=0 clxscore=1015 priorityscore=1501 malwarescore=0 lowpriorityscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210060132 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org As done by the ECB and CBC helpers in arch/x86/crypt/ecb_cbc_helpers.h, limit the number of bytes processed between kernel_fpu_begin() and kernel_fpu_end() calls. Those functions call preempt_disable() and preempt_enable(), so the CPU core is unavailable for scheduling while running. This leads to "rcu_preempt detected expedited stalls" with stack dumps pointing to the optimized hash function if this module is loaded and used a lot: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {12-... } 22 jiffies s: 277 root: 0x1/. For example, that can occur during boot with the stack track pointing to the sha512-x86 function if the system set to use SHA-512 for module signing. The call trace includes: module_sig_check mod_verify_sig pkcs7_verify pkcs7_digest sha512_finup sha512_base_do_update Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64") Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.") Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.") Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features") Suggested-by: Herbert Xu Reviewed-by: Tim Chen Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_ssse3_glue.c | 34 +++++++++++++++++++++++----- arch/x86/crypto/sha256_ssse3_glue.c | 35 ++++++++++++++++++++++++----- arch/x86/crypto/sha512_ssse3_glue.c | 35 ++++++++++++++++++++++++----- 3 files changed, 89 insertions(+), 15 deletions(-) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index 4430463dee62..033812989476 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -27,10 +27,13 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + static int sha1_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha1_block_fn *sha1_xform) { struct sha1_state *sctx = shash_desc_ctx(desc); + unsigned int chunk; if (!crypto_simd_usable() || (sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE) @@ -42,9 +45,18 @@ static int sha1_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0); - kernel_fpu_begin(); - sha1_base_do_update(desc, data, len, sha1_xform); - kernel_fpu_end(); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha1_base_do_update(desc, data, chunk, sha1_xform); + kernel_fpu_end(); + } + + data += chunk; + } while (len); return 0; } @@ -52,12 +64,24 @@ static int sha1_update(struct shash_desc *desc, const u8 *data, static int sha1_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out, sha1_block_fn *sha1_xform) { + unsigned int chunk; + if (!crypto_simd_usable()) return crypto_sha1_finup(desc, data, len, out); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha1_base_do_update(desc, data, chunk, sha1_xform); + kernel_fpu_end(); + } + data += chunk; + } while (len); + kernel_fpu_begin(); - if (len) - sha1_base_do_update(desc, data, len, sha1_xform); sha1_base_do_finalize(desc, sha1_xform); kernel_fpu_end(); diff --git 
a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index e437fba0299b..99a25c238f40 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -40,6 +40,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void sha256_transform_ssse3(struct sha256_state *state, const u8 *data, int blocks); @@ -47,6 +49,7 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha256_block_fn *sha256_xform) { struct sha256_state *sctx = shash_desc_ctx(desc); + unsigned int chunk; if (!crypto_simd_usable() || (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) @@ -58,9 +61,18 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0); - kernel_fpu_begin(); - sha256_base_do_update(desc, data, len, sha256_xform); - kernel_fpu_end(); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha256_base_do_update(desc, data, chunk, sha256_xform); + kernel_fpu_end(); + } + + data += chunk; + } while (len); return 0; } @@ -68,12 +80,25 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, static int sha256_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out, sha256_block_fn *sha256_xform) { + unsigned int chunk; + if (!crypto_simd_usable()) return crypto_sha256_finup(desc, data, len, out); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha256_base_do_update(desc, data, chunk, sha256_xform); + kernel_fpu_end(); + } + + data += chunk; + } while (len); + kernel_fpu_begin(); - if (len) - sha256_base_do_update(desc, data, len, sha256_xform); sha256_base_do_finalize(desc, sha256_xform); kernel_fpu_end(); diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index 3c19f803f288..72eee03448dc 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -39,6 +39,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void sha512_transform_ssse3(struct sha512_state *state, const u8 *data, int blocks); @@ -46,6 +48,7 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha512_block_fn *sha512_xform) { struct sha512_state *sctx = shash_desc_ctx(desc); + unsigned int chunk; if (!crypto_simd_usable() || (sctx->count[0] % SHA512_BLOCK_SIZE) + len < SHA512_BLOCK_SIZE) @@ -57,9 +60,18 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0); - kernel_fpu_begin(); - sha512_base_do_update(desc, data, len, sha512_xform); - kernel_fpu_end(); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha512_base_do_update(desc, data, chunk, sha512_xform); + kernel_fpu_end(); + } + + data += chunk; + } while (len); return 0; } @@ -67,12 +79,25 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, static int sha512_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out, sha512_block_fn *sha512_xform) { + unsigned int chunk; + if (!crypto_simd_usable()) return crypto_sha512_finup(desc, data, len, out); + do { + chunk = min(len, FPU_BYTES); + len -= chunk; + + if (chunk) { + kernel_fpu_begin(); + sha512_base_do_update(desc, data, chunk, sha512_xform); + kernel_fpu_end(); + } + + data += chunk; + } 
while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha512_base_do_update(desc, data, len, sha512_xform);
 	sha512_base_do_finalize(desc, sha512_xform);
 	kernel_fpu_end();

From patchwork Thu Oct 6 22:31:49 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 613010
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 5/7] crypto: x86/ghash - restructure FPU context saving
Date: Thu, 6 Oct 2022 17:31:49 -0500
Message-Id: <20221006223151.22159-6-elliott@hpe.com>
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul in its
own set of kernel_fpu_begin and kernel_fpu_end calls, preparing to limit
the amount of data processed by each _update call in order to avoid RCU
stalls.

This is more like how polyval-clmulni_glue is structured.

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 3a96c167d78d..b25730c5b267 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -82,7 +82,6 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
 
-	kernel_fpu_begin();
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
 		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -93,10 +92,14 @@ static int ghash_update(struct shash_desc *desc,
 		while (n--)
 			*pos++ ^= *src++;
 
-		if (!dctx->bytes)
+		if (!dctx->bytes) {
+			kernel_fpu_begin();
 			clmul_ghash_mul(dst, &ctx->shash);
+			kernel_fpu_end();
+		}
 	}
 
+	kernel_fpu_begin();
 	clmul_ghash_update(dst, src, srclen, &ctx->shash);
 	kernel_fpu_end();
 
From patchwork Thu Oct 6 22:31:51 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 613009
Received: from p1lg14880.it.hpe.com (p1lg14880.it.hpe.com [16.230.97.201]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k27950a2e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Oct 2022 22:32:36 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14880.it.hpe.com (Postfix) with ESMTPS id EA73D806B41; Thu, 6 Oct 2022 22:32:30 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id A2560807F19; Thu, 6 Oct 2022 22:32:30 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [RFC PATCH 7/7] crypto: x86 - use common macro for FPU limit Date: Thu, 6 Oct 2022 17:31:51 -0500 Message-Id: <20221006223151.22159-8-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221006223151.22159-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: s2EqSClVPHeR83IWeMZ0ZM2Rxs6EgdAQ X-Proofpoint-GUID: s2EqSClVPHeR83IWeMZ0ZM2Rxs6EgdAQ X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-10-06_05,2022-10-06_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 impostorscore=0 mlxlogscore=999 suspectscore=0 adultscore=0 spamscore=0 bulkscore=0 clxscore=1015 priorityscore=1501 malwarescore=0 lowpriorityscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210060133 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Use a common macro name (FPU_BYTES) for the limit of the number of bytes processed within kernel_fpu_begin and kernel_fpu_end rather than using SZ_4K (which is a signed value), or a magic value of 4096U. Use unsigned int rather than size_t for some of the arguments to avoid typecasting for the min() macro. Signed-off-by: Robert Elliott --- arch/x86/crypto/blake2s-glue.c | 7 ++++--- arch/x86/crypto/chacha_glue.c | 4 +++- arch/x86/crypto/nhpoly1305-avx2-glue.c | 3 ++- arch/x86/crypto/nhpoly1305-sse2-glue.c | 4 +++- arch/x86/crypto/poly1305_glue.c | 25 ++++++++++++++----------- arch/x86/crypto/polyval-clmulni_glue.c | 5 +++-- 6 files changed, 29 insertions(+), 19 deletions(-) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index a88522e4d0f8..02b72d29dc9b 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -18,6 +18,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, const u8 *block, const size_t nblocks, const u32 inc); @@ -31,8 +33,7 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512); void blake2s_compress(struct blake2s_state *state, const u8 *block, size_t nblocks, const u32 inc) { - /* SIMD disables preemption, so relax after processing each page. 
*/ - BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8); + BUILD_BUG_ON(FPU_BYTES / BLAKE2S_BLOCK_SIZE < 8); if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) { blake2s_compress_generic(state, block, nblocks, inc); @@ -41,7 +42,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block, do { const size_t blocks = min_t(size_t, nblocks, - SZ_4K / BLAKE2S_BLOCK_SIZE); + FPU_BYTES / BLAKE2S_BLOCK_SIZE); kernel_fpu_begin(); if (IS_ENABLED(CONFIG_AS_AVX512) && diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index feb53e90f0e3..40ddd0ce50d6 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -18,6 +18,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src, unsigned int len, int nrounds); asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src, @@ -150,7 +152,7 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, return chacha_crypt_generic(state, dst, src, bytes, nrounds); do { - unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + unsigned int todo = min(bytes, FPU_BYTES); kernel_fpu_begin(); chacha_dosimd(state, dst, src, todo, nrounds); diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index 68cf24213e1c..7e65ccd86f75 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -16,6 +16,7 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len, u8 hash[NH_HASH_BYTES]); @@ -34,7 +35,7 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc, return crypto_nhpoly1305_update(desc, src, srclen); do { - unsigned int n = min_t(unsigned int, srclen, SZ_4K); + unsigned int n = min(srclen, FPU_BYTES); kernel_fpu_begin(); crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2); diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index 75c324253b37..4f35b52e21f0 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -16,6 +16,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len, u8 hash[NH_HASH_BYTES]); @@ -33,7 +35,7 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc, return crypto_nhpoly1305_update(desc, src, srclen); do { - unsigned int n = min_t(unsigned int, srclen, SZ_4K); + unsigned int n = min(srclen, FPU_BYTES); kernel_fpu_begin(); crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2); diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 59d0b01b3389..c036315dbd39 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -18,20 +18,24 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void poly1305_init_x86_64(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]); asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); + const unsigned int len, + const u32 padbit); asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); 
-asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); -asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); +asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, + const unsigned int len, const u32 padbit); +asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, + const unsigned int len, const u32 padbit); asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); + const unsigned int len, + const u32 padbit); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2); @@ -89,14 +93,13 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]) poly1305_init_x86_64(ctx, key); } -static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, +static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len, const u32 padbit) { struct poly1305_arch_internal *state = ctx; - /* SIMD disables preemption, so relax after processing each page. */ - BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE || - SZ_4K % POLY1305_BLOCK_SIZE); + BUILD_BUG_ON(FPU_BYTES < POLY1305_BLOCK_SIZE || + FPU_BYTES % POLY1305_BLOCK_SIZE); if (!static_branch_likely(&poly1305_use_avx) || (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) || @@ -107,7 +110,7 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, } do { - const size_t bytes = min_t(size_t, len, SZ_4K); + const unsigned int bytes = min(len, FPU_BYTES); kernel_fpu_begin(); if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&poly1305_use_avx512)) diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index b7664d018851..2502964afef6 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -29,6 +29,8 @@ #define NUM_KEY_POWERS 8 +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + struct polyval_tfm_ctx { /* * These powers must be in the order h^8, ..., h^1. @@ -123,8 +125,7 @@ static int polyval_x86_update(struct shash_desc *desc, } while (srclen >= POLYVAL_BLOCK_SIZE) { - /* Allow rescheduling every 4K bytes. 
*/ - nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE; + nblocks = min(srclen, FPU_BYTES) / POLYVAL_BLOCK_SIZE; internal_polyval_update(tctx, src, nblocks, dctx->buffer); srclen -= nblocks * POLYVAL_BLOCK_SIZE; src += nblocks * POLYVAL_BLOCK_SIZE; From patchwork Wed Oct 12 21:59:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614614 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BABAC4332F for ; Wed, 12 Oct 2022 22:01:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230086AbiJLWBg (ORCPT ); Wed, 12 Oct 2022 18:01:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230000AbiJLWAc (ORCPT ); Wed, 12 Oct 2022 18:00:32 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C3AE4D4EC; Wed, 12 Oct 2022 15:00:15 -0700 (PDT) Received: from pps.filterd (m0150245.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL8Ert016464; Wed, 12 Oct 2022 22:00:03 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=bap2tNZolRwbU8NuMuLdrUdY6VmdGzhFwpmbpTsze4k=; b=bMqiADQ4M2zeZi+iNQEQd7DsQIboatGHMck+sBlj2TK8XhtdBeaau7ek5GrtMar0Tpaa RB3T0q3TwQwDelJC5xiBxFv+OGDNgMjTlAfzR4W08HiF8DYrPxBoMSS/shqcV2jdeNLK OKN/rVXlguzEOYKj8rzmmCK7L0oqUzsRhvTQnaf2AV8/dVwhu6AuXWmchPM7OtgGYuKe LeZorWzqpNCYeW+GknksbJZ5TzBftFPAESOMSAPCEdqEbEXJ2ugyqZCZuzEAZr533+6Z Gaiy/+X5iFJc6h/2NLgnhFEKUMNeeQ+j+Acj+ikcKDKIalzUcIfHyvm20DT1g/YmWI8M CA== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657j8a5k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:03 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 6E5FD13964; Wed, 12 Oct 2022 22:00:02 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 0FB92805032; Wed, 12 Oct 2022 22:00:02 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 12/19] crypto: x86/sm3 - load based on CPU features Date: Wed, 12 Oct 2022 16:59:24 -0500 Message-Id: <20221012215931.3896-13-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: qJn0nosD52Bi6Md1qaiHmCNMvlHY8Xfb X-Proofpoint-GUID: qJn0nosD52Bi6Md1qaiHmCNMvlHY8Xfb X-HPE-SCL: -1 X-Proofpoint-Virus-Version: 
vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 clxscore=1015 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), add module aliases for x86-optimized crypto modules: sm3 based on CPU feature bits so udev gets a chance to load them later in the boot process when the filesystems are all running. This commit covers a module that created rcu stall issues due to kernel_fpu_begin/kernel_fpu_end calls. Signed-off-by: Robert Elliott --- arch/x86/crypto/sm3_avx_glue.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index ffb6d2f409ef..475b9637a06d 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -115,10 +116,19 @@ static struct shash_alg sm3_avx_alg = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sm3_avx_mod_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!boot_cpu_has(X86_FEATURE_AVX)) { pr_info("AVX instruction are not detected.\n"); return -ENODEV; From patchwork Wed Oct 12 21:59:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8378CC4332F for ; Wed, 12 Oct 2022 22:01:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230037AbiJLWBQ (ORCPT ); Wed, 12 Oct 2022 18:01:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40440 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229989AbiJLWAa (ORCPT ); Wed, 12 Oct 2022 18:00:30 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73F8C4DF36; Wed, 12 Oct 2022 15:00:16 -0700 (PDT) Received: from pps.filterd (m0134425.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CLWO6L026058; Wed, 12 Oct 2022 22:00:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=o8FuU3Q+uLfzqaV74fBLQi/WP+5rrgR7yOITBPlHv9o=; b=L81HtriDA4ZqbzWE5F/tsLihXXg4BeQt5qvnjLtJXuS1UNmP+ulHHdSDC2awfQ/CLUf5 o5xSjQbOULwBd+N9+CE2+OibN8bbtyIgWut6F3py4AdA5oztgtJFfl1APsiClotthErQ wY+gZtpOUoAR+ftNeksN8Rddo9sWCLNVTqOgaM1hSo5cyG+hWAhUYgdk6zLEx2Oh9Y6l NKx9uYJzrB4P2M0YMtKtRXvQXpTrlRPM7BOFssN23S6bwMFeSpy365LhEzpy/IakgeFV 
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com,
	ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 13/19] crypto: x86/ghash - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:25 -0500
Message-Id: <20221012215931.3896-14-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), the x86-optimized ghash module already has a module alias
based on CPU feature bits.

Rename its unique device table data structure to a generic name
(module_cpu_ids) so the code follows the same pattern in all the
modules.

This commit covers a module that created RCU stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.
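The generic pattern the renamed table follows is shown in this minimal sketch; the
table name and the PCLMULQDQ feature bit are taken from the diff below, while the
init/exit function names and the surrounding module boilerplate are illustrative only:

#include <linux/module.h>
#include <asm/cpu_device_id.h>

/* one entry per required CPU feature, terminated by an empty entry */
static const struct x86_cpu_id module_cpu_ids[] = {
	X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
	{}
};
/* emits a modalias so udev can autoload the module on matching CPUs */
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

static int __init example_mod_init(void)
{
	/* refuse to load if the CPU lacks the required feature */
	if (!x86_match_cpu(module_cpu_ids))
		return -ENODEV;
	return 0;
}
module_init(example_mod_init);

static void __exit example_mod_exit(void)
{
}
module_exit(example_mod_exit);

MODULE_LICENSE("GPL");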
Signed-off-by: Robert Elliott --- arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index a39fc405c7cf..69945e41bc41 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -327,17 +327,17 @@ static struct ahash_alg ghash_async_alg = { }, }; -static const struct x86_cpu_id pcmul_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), /* Pickle-Mickle-Duck */ {} }; -MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init ghash_pclmulqdqni_mod_init(void) { int err; - if (!x86_match_cpu(pcmul_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; err = crypto_register_shash(&ghash_alg); From patchwork Wed Oct 12 21:59:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614612 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCEA4C433FE for ; Wed, 12 Oct 2022 22:02:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230085AbiJLWCZ (ORCPT ); Wed, 12 Oct 2022 18:02:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229693AbiJLWBT (ORCPT ); Wed, 12 Oct 2022 18:01:19 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 435E275FFB; Wed, 12 Oct 2022 15:00:22 -0700 (PDT) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CKWT9v015442; Wed, 12 Oct 2022 22:00:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=ntW9Ji65PFoN+eegclvP9j8XWOomEfeAptNcfNSx+Ww=; b=c2Af7gH+FDARfJhl2E3aelLxoxX8RDfySYFsKUjxQ67YSpfZRxYlw4YkQQdalIkINgq7 JjfBwQ+NUfhWUJWfTC7XL44LYuSYoqX+qy3QZ2Nt+lDRsS+oiL+eh9QYkkfjDuY94VZy oMlS6m1SdydQmyvQWHEj4p4quyKOzNrh+FhEBCyHkN1hK9sOS7oGDWGfwaafz2xPE2PB qjVouTBQEaYmSG7t6xIpP7VjSAPRqZzXlZOLbbhJ57xNAhm5pY4nVnI6DsufWx4A3DFL d5Y3kiBfBKCB618kXWqVTkyB2c/VqJ+bXCpUwYBGKHwEP0Wzuo5XTE+ORggCoZhokaMm +Q== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k615t2esc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:09 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 4962E29582; Wed, 12 Oct 2022 22:00:08 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id E2DA18003BA; Wed, 12 Oct 2022 22:00:07 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, 
davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 16/19] crypto: x86 - print CPU optimized loaded messages Date: Wed, 12 Oct 2022 16:59:28 -0500 Message-Id: <20221012215931.3896-17-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: _Qi2zLBrbtmT0s8gvCC9eeuThCRMqzid X-Proofpoint-ORIG-GUID: _Qi2zLBrbtmT0s8gvCC9eeuThCRMqzid X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 phishscore=0 mlxlogscore=999 lowpriorityscore=0 suspectscore=0 adultscore=0 impostorscore=0 mlxscore=0 malwarescore=0 bulkscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Print a positive message at the info level if the CPU-optimized module is loaded, for all modules except the sha modules. Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 8 +++++-- arch/x86/crypto/aesni-intel_glue.c | 22 +++++++++++++------ arch/x86/crypto/aria_aesni_avx_glue.c | 13 ++++++++--- arch/x86/crypto/blake2s-glue.c | 14 ++++++++++-- arch/x86/crypto/blowfish_glue.c | 2 ++ arch/x86/crypto/camellia_aesni_avx2_glue.c | 6 +++++- arch/x86/crypto/camellia_aesni_avx_glue.c | 6 +++++- arch/x86/crypto/camellia_glue.c | 3 +++ arch/x86/crypto/cast5_avx_glue.c | 6 +++++- arch/x86/crypto/cast6_avx_glue.c | 6 +++++- arch/x86/crypto/chacha_glue.c | 17 +++++++++++++-- arch/x86/crypto/crc32-pclmul_glue.c | 8 ++++++- arch/x86/crypto/crc32c-intel_glue.c | 15 +++++++++++-- arch/x86/crypto/crct10dif-pclmul_glue.c | 7 +++++- arch/x86/crypto/curve25519-x86_64.c | 13 +++++++++-- arch/x86/crypto/des3_ede_glue.c | 2 ++ arch/x86/crypto/ghash-clmulni-intel_glue.c | 1 + arch/x86/crypto/nhpoly1305-avx2-glue.c | 7 +++++- arch/x86/crypto/nhpoly1305-sse2-glue.c | 7 +++++- arch/x86/crypto/poly1305_glue.c | 25 ++++++++++++++++++---- arch/x86/crypto/polyval-clmulni_glue.c | 7 +++++- arch/x86/crypto/serpent_avx2_glue.c | 7 ++++-- arch/x86/crypto/serpent_avx_glue.c | 6 +++++- arch/x86/crypto/serpent_sse2_glue.c | 7 +++++- arch/x86/crypto/sm3_avx_glue.c | 6 +++++- arch/x86/crypto/sm4_aesni_avx2_glue.c | 6 +++++- arch/x86/crypto/sm4_aesni_avx_glue.c | 7 ++++-- arch/x86/crypto/twofish_avx_glue.c | 10 ++++++--- arch/x86/crypto/twofish_glue.c | 7 +++++- arch/x86/crypto/twofish_glue_3way.c | 9 ++++++-- 30 files changed, 213 insertions(+), 47 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 122bfd04ee47..e8eaf79ef220 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -275,7 +275,8 @@ static struct simd_aead_alg *simd_alg; static int __init crypto_aegis128_aesni_module_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + int ret; + return -ENODEV; if (!boot_cpu_has(X86_FEATURE_XMM2) || @@ -283,8 +284,11 @@ static int __init crypto_aegis128_aesni_module_init(void) !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; - return 
simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, + ret = simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, &simd_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crypto_aegis128_aesni_module_exit(void) diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index df93cb44b4eb..56023ba70049 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -1238,25 +1238,28 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init aesni_init(void) { int err; + int enabled_gcm_sse = 0; + int enabled_gcm_avx = 0; + int enabled_gcm_avx2 = 0; + int enabled_ctr_avx = 0; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_AVX2)) { - pr_info("AVX2 version of gcm_enc/dec engaged.\n"); + enabled_gcm_avx = 1; + enabled_gcm_avx2 = 1; static_branch_enable(&gcm_use_avx); static_branch_enable(&gcm_use_avx2); - } else - if (boot_cpu_has(X86_FEATURE_AVX)) { - pr_info("AVX version of gcm_enc/dec engaged.\n"); + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + enabled_gcm_avx = 1; static_branch_enable(&gcm_use_avx); } else { - pr_info("SSE version of gcm_enc/dec engaged.\n"); + enabled_gcm_sse = 1; } if (boot_cpu_has(X86_FEATURE_AVX)) { - /* optimize performance of ctr mode encryption transform */ + enabled_ctr_avx = 1; static_call_update(aesni_ctr_enc_tfm, aesni_ctr_enc_avx_tfm); - pr_info("AES CTR mode by8 optimization enabled\n"); } #endif /* CONFIG_X86_64 */ @@ -1283,6 +1286,11 @@ static int __init aesni_init(void) goto unregister_aeads; #endif /* CONFIG_X86_64 */ + pr_info("CPU-optimized crypto module loaded (GCM SSE=%s, AVX=%s, AVX2=%s)(CTR AVX=%s)\n", + enabled_gcm_sse ? "yes" : "no", + enabled_gcm_avx ? "yes" : "no", + enabled_gcm_avx2 ? "yes" : "no", + enabled_ctr_avx ? "yes" : "no"); return 0; #ifdef CONFIG_X86_64 diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index 589097728bd1..d58fb995a266 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -170,6 +170,8 @@ static struct simd_skcipher_alg *aria_simd_algs[ARRAY_SIZE(aria_algs)]; static int __init aria_avx_init(void) { const char *feature_name; + int ret; + int enabled_gfni = 0; if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || @@ -188,15 +190,20 @@ static int __init aria_avx_init(void) aria_ops.aria_encrypt_16way = aria_aesni_avx_gfni_encrypt_16way; aria_ops.aria_decrypt_16way = aria_aesni_avx_gfni_decrypt_16way; aria_ops.aria_ctr_crypt_16way = aria_aesni_avx_gfni_ctr_crypt_16way; + enabled_gfni = 1; } else { aria_ops.aria_encrypt_16way = aria_aesni_avx_encrypt_16way; aria_ops.aria_decrypt_16way = aria_aesni_avx_decrypt_16way; aria_ops.aria_ctr_crypt_16way = aria_aesni_avx_ctr_crypt_16way; } - return simd_register_skciphers_compat(aria_algs, - ARRAY_SIZE(aria_algs), - aria_simd_algs); + ret = simd_register_skciphers_compat(aria_algs, + ARRAY_SIZE(aria_algs), + aria_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded (GFNI=%s)\n", + enabled_gfni ? 
"yes" : "no"); + return ret; } static void __exit aria_avx_exit(void) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index ac7fb7a9922b..4f2f385f6674 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -66,11 +66,16 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init blake2s_mod_init(void) { + int enabled_ssse3 = 0; + int enabled_avx512 = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (boot_cpu_has(X86_FEATURE_SSSE3)) { + enabled_ssse3 = 1; static_branch_enable(&blake2s_use_ssse3); + } if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX) && @@ -78,9 +83,14 @@ static int __init blake2s_mod_init(void) boot_cpu_has(X86_FEATURE_AVX512F) && boot_cpu_has(X86_FEATURE_AVX512VL) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | - XFEATURE_MASK_AVX512, NULL)) + XFEATURE_MASK_AVX512, NULL)) { + enabled_avx512 = 1; static_branch_enable(&blake2s_use_avx512); + } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX512=%s)\n", + enabled_ssse3 ? "yes" : "no", + enabled_avx512 ? "yes" : "no"); return 0; } diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 5cfcbb91c4ca..27b7aed9a488 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -336,6 +336,8 @@ static int __init blowfish_init(void) if (err) crypto_unregister_alg(&bf_cipher_alg); + if (!err) + pr_info("CPU-optimized crypto module loaded\n"); return err; } diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index 851f2a29963c..e6c4ed1e40d2 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -114,6 +114,7 @@ static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -132,9 +133,12 @@ static int __init camellia_aesni_init(void) return -ENODEV; } - return simd_register_skciphers_compat(camellia_algs, + ret = simd_register_skciphers_compat(camellia_algs, ARRAY_SIZE(camellia_algs), camellia_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit camellia_aesni_fini(void) diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index 8846493c92fb..6a9eadf0fe90 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -113,6 +113,7 @@ static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -130,9 +131,12 @@ static int __init camellia_aesni_init(void) return -ENODEV; } - return simd_register_skciphers_compat(camellia_algs, + ret = simd_register_skciphers_compat(camellia_algs, ARRAY_SIZE(camellia_algs), camellia_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit camellia_aesni_fini(void) diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index 3c14a904af00..94dd2973bb47 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -1410,6 +1410,9 @@ static int __init camellia_init(void) if (err) crypto_unregister_alg(&camellia_cipher_alg); + if (!err) + 
pr_info("CPU-optimized crypto module loaded\n"); + return err; } diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index fdeec0849ab5..b5ae17c3ac53 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -107,6 +107,7 @@ static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)]; static int __init cast5_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -117,9 +118,12 @@ static int __init cast5_init(void) return -ENODEV; } - return simd_register_skciphers_compat(cast5_algs, + ret = simd_register_skciphers_compat(cast5_algs, ARRAY_SIZE(cast5_algs), cast5_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit cast5_exit(void) diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index 9258082408eb..d1c14a5f80d7 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -107,6 +107,7 @@ static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)]; static int __init cast6_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -117,9 +118,12 @@ static int __init cast6_init(void) return -ENODEV; } - return simd_register_skciphers_compat(cast6_algs, + ret = simd_register_skciphers_compat(cast6_algs, ARRAY_SIZE(cast6_algs), cast6_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit cast6_exit(void) diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 8e5cadc808b4..de424fbe9f0e 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -289,6 +289,9 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init chacha_simd_mod_init(void) { + int ret; + int enabled_avx2 = 0; + int enabled_avx512 = 0; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -298,15 +301,25 @@ static int __init chacha_simd_mod_init(void) if (boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx2 = 1; static_branch_enable(&chacha_use_avx2); if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX512VL) && - boot_cpu_has(X86_FEATURE_AVX512BW)) /* kmovq */ + boot_cpu_has(X86_FEATURE_AVX512BW)) { /* kmovq */ + enabled_avx512 = 1; static_branch_enable(&chacha_use_avx512vl); + } } - return IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ? + ret = IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ? crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (AVX2=%s, AVX512=%s)\n", + enabled_avx2 ? "yes" : "no", + enabled_avx512 ? 
"yes" : "no"); + else + pr_info("CPU-optimized crypto module not loaded"); + return ret; } static void __exit chacha_simd_mod_fini(void) diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index bc2b31b04e05..c56d3d3ab0a0 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -190,9 +190,15 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32_pclmul_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_shash(&alg); + + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crc32_pclmul_mod_fini(void) diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index ebf530934a3e..c633d303f19b 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -242,16 +242,27 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32c_intel_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + int ret; + int pcl_enabled = 0; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU feature (SSE4.2) not supported\n"); return -ENODEV; + } + #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) { + pcl_enabled = 1; alg.update = crc32c_pcl_intel_update; alg.finup = crc32c_pcl_intel_finup; alg.digest = crc32c_pcl_intel_digest; } #endif - return crypto_register_shash(&alg); + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded (PCLMULQDQ=%s)\n", + pcl_enabled ? "yes" : "no"); + return ret; } static void __exit crc32c_intel_mod_fini(void) diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 03e35a1b7677..4476b9af1e61 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -146,10 +146,15 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crct10dif_intel_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_shash(&alg); + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crct10dif_intel_mod_fini(void) diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index f9a1adb0c183..b9289feef375 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -1709,15 +1709,24 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init curve25519_mod_init(void) { + int ret; + int enabled_adx = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) + if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) { + enabled_adx = 1; static_branch_enable(&curve25519_use_bmi2_adx); + } else return 0; - return IS_REACHABLE(CONFIG_CRYPTO_KPP) ? + ret = IS_REACHABLE(CONFIG_CRYPTO_KPP) ? crypto_register_kpp(&curve25519_alg) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (ADX=%s)\n", + enabled_adx ? 
"yes" : "no"); + return ret; } static void __exit curve25519_mod_exit(void) diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index 83e686a6c2f3..7b4dd02007ed 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -384,6 +384,8 @@ static int __init des3_ede_x86_init(void) if (err) crypto_unregister_alg(&des3_ede_cipher); + if (!err) + pr_info("CPU-optimized crypto module loaded\n"); return err; } diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 3ad55144da48..496a410eaff7 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -349,6 +349,7 @@ static int __init ghash_pclmulqdqni_mod_init(void) if (err) goto err_shash; + pr_info("CPU-optimized crypto module loaded\n"); return 0; err_shash: diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index 40f49107e5a9..2dc7b618771f 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -68,6 +68,8 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init nhpoly1305_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -75,7 +77,10 @@ static int __init nhpoly1305_mod_init(void) !boot_cpu_has(X86_FEATURE_OSXSAVE)) return -ENODEV; - return crypto_register_shash(&nhpoly1305_alg); + ret = crypto_register_shash(&nhpoly1305_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit nhpoly1305_mod_exit(void) diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index bb40fed92c92..bf0f8ac7afd6 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -68,13 +68,18 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init nhpoly1305_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_XMM2)) return -ENODEV; - return crypto_register_shash(&nhpoly1305_alg); + ret = crypto_register_shash(&nhpoly1305_alg); + if (ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit nhpoly1305_mod_exit(void) diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index a2a7cb39cdec..c9ebb6b90d1f 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -273,22 +273,39 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init poly1305_simd_mod_init(void) { + int ret; + int enabled_avx = 0; + int enabled_avx2 = 0; + int enabled_avx512 = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (boot_cpu_has(X86_FEATURE_AVX) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx = 1; static_branch_enable(&poly1305_use_avx); + } if (boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx2 = 1; static_branch_enable(&poly1305_use_avx2); + } if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX512F) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | XFEATURE_MASK_AVX512, NULL) && /* Skylake downclocks unacceptably much when using zmm, but later generations are fast. 
*/ - boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X) + boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X) { + enabled_avx512 = 1; static_branch_enable(&poly1305_use_avx512); - return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? crypto_register_shash(&alg) : 0; + } + ret = IS_REACHABLE(CONFIG_CRYPTO_HASH) ? crypto_register_shash(&alg) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (AVX=%s, AVX2=%s, AVX512=%s)\n", + enabled_avx ? "yes" : "no", + enabled_avx2 ? "yes" : "no", + enabled_avx512 ? "yes" : "no"); + return ret; } static void __exit poly1305_simd_mod_exit(void) diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index 5a345db20ca9..7a3a80085c90 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -183,13 +183,18 @@ MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); static int __init polyval_clmulni_mod_init(void) { + int ret; + if (!x86_match_cpu(pcmul_cpu_id)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX)) return -ENODEV; - return crypto_register_shash(&polyval_alg); + ret = crypto_register_shash(&polyval_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit polyval_clmulni_mod_exit(void) diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index 5944bf5ead2e..bf59addaf804 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -108,8 +108,8 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_avx2_init(void) { const char *feature_name; + int ret; - if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { @@ -122,9 +122,12 @@ static int __init serpent_avx2_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_avx2_fini(void) diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 45713c7a4cb9..7b0c02a61552 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -114,6 +114,7 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -124,9 +125,12 @@ static int __init serpent_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_exit(void) diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c index d8aa0d3fbf15..f82880ef6f10 100644 --- a/arch/x86/crypto/serpent_sse2_glue.c +++ b/arch/x86/crypto/serpent_sse2_glue.c @@ -116,6 +116,8 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_sse2_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -124,9 +126,12 @@ static int __init serpent_sse2_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), 
serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_sse2_exit(void) diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index 475b9637a06d..532f07b05745 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -125,6 +125,7 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init sm3_avx_mod_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -145,7 +146,10 @@ static int __init sm3_avx_mod_init(void) return -ENODEV; } - return crypto_register_shash(&sm3_avx_alg); + ret = crypto_register_shash(&sm3_avx_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm3_avx_mod_exit(void) diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c index 3fe9e170b880..42819ee5d36d 100644 --- a/arch/x86/crypto/sm4_aesni_avx2_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c @@ -143,6 +143,7 @@ simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)]; static int __init sm4_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -161,9 +162,12 @@ static int __init sm4_init(void) return -ENODEV; } - return simd_register_skciphers_compat(sm4_aesni_avx2_skciphers, + ret = simd_register_skciphers_compat(sm4_aesni_avx2_skciphers, ARRAY_SIZE(sm4_aesni_avx2_skciphers), simd_sm4_aesni_avx2_skciphers); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm4_exit(void) diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c index 14ae012948ae..8a25376d341f 100644 --- a/arch/x86/crypto/sm4_aesni_avx_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx_glue.c @@ -461,8 +461,8 @@ simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)]; static int __init sm4_init(void) { const char *feature_name; + int ret; - if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX) || @@ -478,9 +478,12 @@ static int __init sm4_init(void) return -ENODEV; } - return simd_register_skciphers_compat(sm4_aesni_avx_skciphers, + ret = simd_register_skciphers_compat(sm4_aesni_avx_skciphers, ARRAY_SIZE(sm4_aesni_avx_skciphers), simd_sm4_aesni_avx_skciphers); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm4_exit(void) diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c index 044e4f92e2c0..ccf016bf6ef2 100644 --- a/arch/x86/crypto/twofish_avx_glue.c +++ b/arch/x86/crypto/twofish_avx_glue.c @@ -117,6 +117,7 @@ static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)]; static int __init twofish_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -126,9 +127,12 @@ static int __init twofish_init(void) return -ENODEV; } - return simd_register_skciphers_compat(twofish_algs, - ARRAY_SIZE(twofish_algs), - twofish_simd_algs); + ret = simd_register_skciphers_compat(twofish_algs, + ARRAY_SIZE(twofish_algs), + twofish_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_exit(void) diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c index 031ed290c755..5756b9cab982 100644 --- a/arch/x86/crypto/twofish_glue.c +++ b/arch/x86/crypto/twofish_glue.c @@ -92,10 +92,15 @@ 
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init twofish_glue_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_alg(&alg); + ret = crypto_register_alg(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_glue_fini(void) diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c index 7e2a18e3abe7..2fde637b40c8 100644 --- a/arch/x86/crypto/twofish_glue_3way.c +++ b/arch/x86/crypto/twofish_glue_3way.c @@ -151,6 +151,8 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init twofish_3way_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -162,8 +164,11 @@ static int __init twofish_3way_init(void) return -ENODEV; } - return crypto_register_skciphers(tf_skciphers, - ARRAY_SIZE(tf_skciphers)); + ret = crypto_register_skciphers(tf_skciphers, + ARRAY_SIZE(tf_skciphers)); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_3way_fini(void) From patchwork Wed Oct 12 21:59:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B789C43217 for ; Wed, 12 Oct 2022 22:01:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229769AbiJLWBu (ORCPT ); Wed, 12 Oct 2022 18:01:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229908AbiJLWAn (ORCPT ); Wed, 12 Oct 2022 18:00:43 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F58C7654B; Wed, 12 Oct 2022 15:00:22 -0700 (PDT) Received: from pps.filterd (m0150245.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL88dt016394; Wed, 12 Oct 2022 22:00:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=cA+jiT1Ykbspsq4IfGAHBDwf5ujiMklaMuWz0ZTfqho=; b=JlnPlMbr7D+Jfwij5Q+naBSj+gztLGJQaolsbe0LmM+o9uGUjg4ry5uzr5m8YzyFwqCf qx07bS3/uudkyA1gfyVaZR9/evrdk/TEibHa9bHD7Oe/eyY3zxcOdxTqe7d1iFnfwxhf /KQJixQ7fDJY+XGT243GGKAMBGtI5DIaLFOF7sLcp63FNp24x1urqOoksGnmJKTMKlWa 2O1RNDHJk0hMqr+HwI2DfOpx+y5TgTjJrv8Wfgncvlqdu5AyKGWdH2eDULjQHS5ZaS2v GYt29VIX/q2dCrO0c8qRwNh9v6a7Tth+z53OmFhSzST/E5vzV8kdGrLcSk0EZTn72VQx 0g== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657j8a7s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:10 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 90B3713970; Wed, 12 Oct 2022 22:00:09 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown 
[16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 1E63D807DA5; Wed, 12 Oct 2022 22:00:09 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 17/19] crypto: x86 - standardize suboptimal prints Date: Wed, 12 Oct 2022 16:59:29 -0500 Message-Id: <20221012215931.3896-18-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: XKLvG6nOYC17q4IoVPFq8vnXPAREF7v6 X-Proofpoint-GUID: XKLvG6nOYC17q4IoVPFq8vnXPAREF7v6 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 clxscore=1015 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Reword prints that the module is not being loaded (although it otherwise qualifies) because performance would be suboptimal on the particular CPU model. Although modules are not supposed to print unless they're loaded and active, this is an existing exception. Signed-off-by: Robert Elliott --- arch/x86/crypto/blowfish_glue.c | 5 +---- arch/x86/crypto/camellia_glue.c | 5 +---- arch/x86/crypto/des3_ede_glue.c | 2 +- arch/x86/crypto/twofish_glue_3way.c | 5 +---- 4 files changed, 4 insertions(+), 13 deletions(-) diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 27b7aed9a488..8d4ecf406dee 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -320,10 +320,7 @@ static int __init blowfish_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "blowfish-x86_64: performance on this CPU " - "would be suboptimal: disabling " - "blowfish-x86_64.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index 94dd2973bb47..002a1e84b277 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -1394,10 +1394,7 @@ static int __init camellia_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "camellia-x86_64: performance on this CPU " - "would be suboptimal: disabling " - "camellia-x86_64.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index 7b4dd02007ed..b38ad3ec38e2 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -371,7 +371,7 @@ static int __init des3_ede_x86_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - pr_info("des3_ede-x86_64: performance on this CPU would be suboptimal: disabling des3_ede-x86_64.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto 
optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c index 2fde637b40c8..29c35d2aaeba 100644 --- a/arch/x86/crypto/twofish_glue_3way.c +++ b/arch/x86/crypto/twofish_glue_3way.c @@ -157,10 +157,7 @@ static int __init twofish_3way_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "twofish-x86_64-3way: performance on this CPU " - "would be suboptimal: disabling " - "twofish-x86_64-3way.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } From patchwork Wed Oct 12 21:59:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614610 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECFEAC433FE for ; Wed, 12 Oct 2022 22:04:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229972AbiJLWEt (ORCPT ); Wed, 12 Oct 2022 18:04:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229978AbiJLWEB (ORCPT ); Wed, 12 Oct 2022 18:04:01 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B116A13954B; Wed, 12 Oct 2022 15:01:28 -0700 (PDT) Received: from pps.filterd (m0134425.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL7Wtg016423; Wed, 12 Oct 2022 22:00:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=X7Go4viamzwOn3WUTwlo1LbW+Q0tptmA7a1g3x+mSls=; b=P89Qz77nTHr377CTaQ/w4cfXZnv9P2GTFzorUYCdAaLW32Qi5hAMB4PfaApAqOM+RWxF 8nI7zrA1czGGQBc3YGt2cA6JnciH+Gbw1GHXRFLLupGSzxZxweyF7CPlwRJkv7XvV8mB 8R0w32EgX5W8nusFomefeeoeWXOMzh6S5y3EwWzlliueSEDXbc1KsoVwXSI9ObAJispM U1I0Zx0SbTwNj+jh387WY1LA7nvqHRpUjOqIYINF2JnZYDV6t2E9xiDJncFAw/clHFX+ 53fDY8vxloTFZayxAiee7eslHFFtuyuCUeuvRhHI/QcM6F3AZwBxxhU1mPzNdLd5gjRE jw== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657c8adb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:11 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id C626129585; Wed, 12 Oct 2022 22:00:10 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 6DADB807DA5; Wed, 12 Oct 2022 22:00:10 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 18/19] crypto: x86 - standardize not loaded prints Date: Wed, 
12 Oct 2022 16:59:30 -0500 Message-Id: <20221012215931.3896-19-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: 82XBGjj4A8ErvKSGv2QYlO_RI8X_yhn1 X-Proofpoint-GUID: 82XBGjj4A8ErvKSGv2QYlO_RI8X_yhn1 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 clxscore=1015 impostorscore=0 suspectscore=0 bulkscore=0 adultscore=0 phishscore=0 spamscore=0 malwarescore=0 lowpriorityscore=0 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Standardize the prints that additional required CPU features are not present along with the main CPU features (e.g., OSXSAVE is not present along with AVX). Although modules are not supposed to print unless loaded and active, these are existing exceptions. Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 4 +++- arch/x86/crypto/aria_aesni_avx_glue.c | 4 ++-- arch/x86/crypto/camellia_aesni_avx2_glue.c | 5 +++-- arch/x86/crypto/camellia_aesni_avx_glue.c | 5 +++-- arch/x86/crypto/cast5_avx_glue.c | 3 ++- arch/x86/crypto/cast6_avx_glue.c | 3 ++- arch/x86/crypto/crc32-pclmul_glue.c | 4 +++- arch/x86/crypto/nhpoly1305-avx2-glue.c | 4 +++- arch/x86/crypto/serpent_avx2_glue.c | 8 +++++--- arch/x86/crypto/serpent_avx_glue.c | 3 ++- arch/x86/crypto/sm3_avx_glue.c | 7 ++++--- arch/x86/crypto/sm4_aesni_avx2_glue.c | 5 +++-- arch/x86/crypto/sm4_aesni_avx_glue.c | 5 +++-- arch/x86/crypto/twofish_avx_glue.c | 3 ++- 14 files changed, 40 insertions(+), 23 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index e8eaf79ef220..aa94b9f8703c 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -281,8 +281,10 @@ static int __init crypto_aegis128_aesni_module_init(void) if (!boot_cpu_has(X86_FEATURE_XMM2) || !boot_cpu_has(X86_FEATURE_AES) || - !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (SSE2, AESNI) not supported\n"); return -ENODEV; + } ret = simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, &simd_alg); diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index d58fb995a266..24982450a125 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -176,13 +176,13 @@ static int __init aria_avx_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AES-NI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU extended feature '%s' is not supported\n", feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c 
b/arch/x86/crypto/camellia_aesni_avx2_glue.c index e6c4ed1e40d2..bc6862077984 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -123,13 +123,14 @@ static int __init camellia_aesni_init(void) !boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AVX2, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index 6a9eadf0fe90..96e7e1accb6c 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -121,13 +121,14 @@ static int __init camellia_aesni_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index b5ae17c3ac53..89650fffb550 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -114,7 +114,8 @@ static int __init cast5_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index d1c14a5f80d7..d69f62ac9553 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -114,7 +114,8 @@ static int __init cast6_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index c56d3d3ab0a0..4cf86f8f9428 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -192,8 +192,10 @@ static int __init crc32_pclmul_mod_init(void) { int ret; - if (!x86_match_cpu(module_cpu_ids)) + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU feature (PCLMULQDQ) not supported\n"); return -ENODEV; + } ret = crypto_register_shash(&alg); if (!ret) diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index 2dc7b618771f..834bf64bb160 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -74,8 
+74,10 @@ static int __init nhpoly1305_mod_init(void) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX2) || - !boot_cpu_has(X86_FEATURE_OSXSAVE)) + !boot_cpu_has(X86_FEATURE_OSXSAVE)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX2, OSXSAVE) not supported\n"); return -ENODEV; + } ret = crypto_register_shash(&nhpoly1305_alg); if (!ret) diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index bf59addaf804..4bd59ccea69a 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -112,13 +112,15 @@ static int __init serpent_avx2_init(void) return -ENODEV; - if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 instructions are not detected.\n"); + if (!boot_cpu_has(X86_FEATURE_AVX2) || + !boot_cpu_has(X86_FEATURE_OSXSAVE)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX2, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 7b0c02a61552..853b48677d2b 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -121,7 +121,8 @@ static int __init serpent_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index 532f07b05745..5250fee79147 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -131,18 +131,19 @@ static int __init sm3_avx_mod_init(void) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX)) { - pr_info("AVX instruction are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, required CPU feature (AVX) not supported\n"); return -ENODEV; } if (!boot_cpu_has(X86_FEATURE_BMI2)) { - pr_info("BMI2 instruction are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, required CPU feature (BMI2) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c index 42819ee5d36d..cdd7ca92ca61 100644 --- a/arch/x86/crypto/sm4_aesni_avx2_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c @@ -152,13 +152,14 @@ static int __init sm4_init(void) !boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AVX2, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module 
not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c index 8a25376d341f..a2ae3d1e0a4a 100644 --- a/arch/x86/crypto/sm4_aesni_avx_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx_glue.c @@ -468,13 +468,14 @@ static int __init sm4_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c index ccf016bf6ef2..70167dd01816 100644 --- a/arch/x86/crypto/twofish_avx_glue.c +++ b/arch/x86/crypto/twofish_avx_glue.c @@ -123,7 +123,8 @@ static int __init twofish_init(void) return -ENODEV; if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } From patchwork Wed Oct 12 21:59:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7831AC433FE for ; Wed, 12 Oct 2022 22:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230087AbiJLWDF (ORCPT ); Wed, 12 Oct 2022 18:03:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230094AbiJLWBj (ORCPT ); Wed, 12 Oct 2022 18:01:39 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8860610324C; Wed, 12 Oct 2022 15:00:27 -0700 (PDT) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CKWT9x015442; Wed, 12 Oct 2022 22:00:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=K0yyN1Z9ykifAfRg9N8WShvUGMpa8FIhYC0CUsfvRDw=; b=mVhez82LOF2jhuRvGSI27LUkKOAUZ1b+Edj+9zvfQvsFVE3ngyNXmhmblWCmkUoyVzXx P4Q8i6D3de3Hw9zHBntr2L1AoqeBKOMtukz5H+kYMbH/6Fnog9chP1WlXbydGN/tiLhv e/Yy/WdEbxii64ByH/EoxRuYQXLjHYzASqr2bx1q12AsL6z8wFOGDZdvumKL9GAnQFUb uXkO98EWWUSOgjlzsSqJVqiZftJdj6UXObapdVRgGkHh4XnIBp5ewl7GY3s1Pk5adaYW wA2bGojpRB/fXeN0ES2aO3wdXEjMdDxPO8qLsekzkH5tUHcGhI5ZpV/amfBi4QJPpP04 tQ== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k615t2eth-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:12 +0000 
Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 013CBD268; Wed, 12 Oct 2022 22:00:11 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 9B9058038C2; Wed, 12 Oct 2022 22:00:11 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 19/19] crypto: x86/sha - register only the best function Date: Wed, 12 Oct 2022 16:59:31 -0500 Message-Id: <20221012215931.3896-20-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: MtKhtP7F9kP6sMI-XKAtztT-zs2OFsZc X-Proofpoint-ORIG-GUID: MtKhtP7F9kP6sMI-XKAtztT-zs2OFsZc X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 phishscore=0 mlxlogscore=999 lowpriorityscore=0 suspectscore=0 adultscore=0 impostorscore=0 mlxscore=0 malwarescore=0 bulkscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Don't register and unregister each of the functions from least- to most-optimized (SSSE3 then AVX then AVX2); determine the most-optimized function and load only that version. 
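The resulting init flow amounts to a priority walk over the implementations, registering only the first usable one. The following is a rough, minimal user-space sketch of that policy only, not the kernel code itself: the struct, the have_*() helpers, and their stubbed return values are hypothetical stand-ins for the boot_cpu_has()/cpu_has_xfeatures() checks done in the actual patch.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct impl {
        const char *name;
        bool (*usable)(void);
};

/* stubs standing in for the CPU feature checks */
static bool have_sha_ni(void)    { return false; }
static bool have_avx2_bmi2(void) { return true; }
static bool have_avx_xsave(void) { return true; }
static bool have_ssse3(void)     { return true; }

int main(void)
{
        /* most- to least-optimized, mirroring the order used in this patch */
        const struct impl impls[] = {
                { "sha1-ni",    have_sha_ni },
                { "sha1-avx2",  have_avx2_bmi2 },
                { "sha1-avx",   have_avx_xsave },
                { "sha1-ssse3", have_ssse3 },
        };

        for (size_t i = 0; i < sizeof(impls) / sizeof(impls[0]); i++) {
                if (impls[i].usable()) {
                        /* register only this one; lower-priority variants stay unregistered */
                        printf("registering %s\n", impls[i].name);
                        return 0;
                }
        }

        printf("no usable implementation, not loading\n");
        return 1;
}

With the stubs above, SHA-NI is reported unusable, so only the AVX2 entry is registered; the AVX and SSSE3 variants are never registered and therefore never need to be unregistered on an error path.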
Suggested-by: Tim Chen Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_ssse3_glue.c | 139 ++++++++++++------------- arch/x86/crypto/sha256_ssse3_glue.c | 154 ++++++++++++++-------------- arch/x86/crypto/sha512_ssse3_glue.c | 120 ++++++++++++---------- 3 files changed, 210 insertions(+), 203 deletions(-) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index edffc33bd12e..90a86d737bcf 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -123,17 +123,16 @@ static struct shash_alg sha1_ssse3_alg = { } }; -static int register_sha1_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shash(&sha1_ssse3_alg); - return 0; -} - +static bool sha1_ssse3_registered; +static bool sha1_avx_registered; +static bool sha1_avx2_registered; +static bool sha1_ni_registered; static void unregister_sha1_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha1_ssse3_registered) { crypto_unregister_shash(&sha1_ssse3_alg); + sha1_ssse3_registered = 0; + } } asmlinkage void sha1_transform_avx(struct sha1_state *state, @@ -172,28 +171,12 @@ static struct shash_alg sha1_avx_alg = { } }; -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} - -static int register_sha1_avx(void) -{ - if (avx_usable()) - return crypto_register_shash(&sha1_avx_alg); - return 0; -} - static void unregister_sha1_avx(void) { - if (avx_usable()) + if (sha1_avx_registered) { crypto_unregister_shash(&sha1_avx_alg); + sha1_avx_registered = 0; + } } #define SHA1_AVX2_BLOCK_OPTSIZE 4 /* optimal 4*64 bytes of SHA1 blocks */ @@ -201,16 +184,6 @@ static void unregister_sha1_avx(void) asmlinkage void sha1_transform_avx2(struct sha1_state *state, const u8 *data, int blocks); -static bool avx2_usable(void) -{ - if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) - && boot_cpu_has(X86_FEATURE_BMI1) - && boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - static void sha1_apply_transform_avx2(struct sha1_state *state, const u8 *data, int blocks) { @@ -254,17 +227,13 @@ static struct shash_alg sha1_avx2_alg = { } }; -static int register_sha1_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shash(&sha1_avx2_alg); - return 0; -} static void unregister_sha1_avx2(void) { - if (avx2_usable()) + if (sha1_avx2_registered) { crypto_unregister_shash(&sha1_avx2_alg); + sha1_avx2_registered = 0; + } } #ifdef CONFIG_AS_SHA1_NI @@ -304,13 +273,6 @@ static struct shash_alg sha1_ni_alg = { } }; -static int register_sha1_ni(void) -{ - if (boot_cpu_has(X86_FEATURE_SHA_NI)) - return crypto_register_shash(&sha1_ni_alg); - return 0; -} - static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), @@ -322,44 +284,79 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static void unregister_sha1_ni(void) { - if (boot_cpu_has(X86_FEATURE_SHA_NI)) + if (sha1_ni_registered) { crypto_unregister_shash(&sha1_ni_alg); + sha1_ni_registered = 0; + } } #else -static inline int register_sha1_ni(void) { return 0; } static inline void unregister_sha1_ni(void) { } #endif static int __init sha1_ssse3_mod_init(void) { - if (register_sha1_ssse3()) - goto fail; + const char *feature_name; + const char *driver_name = NULL; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (register_sha1_avx()) { - 
unregister_sha1_ssse3(); - goto fail; - } + /* SHA-NI */ + if (boot_cpu_has(X86_FEATURE_SHA_NI)) { - if (register_sha1_avx2()) { - unregister_sha1_avx(); - unregister_sha1_ssse3(); - goto fail; - } + ret = crypto_register_shash(&sha1_ni_alg); + if (!ret) + sha1_ni_registered = 1; - if (register_sha1_ni()) { - unregister_sha1_avx2(); - unregister_sha1_avx(); - unregister_sha1_ssse3(); - goto fail; + /* AVX2 */ + } else if (boot_cpu_has(X86_FEATURE_AVX2)) { + + if (boot_cpu_has(X86_FEATURE_BMI1) && + boot_cpu_has(X86_FEATURE_BMI2)) { + + ret = crypto_register_shash(&sha1_avx2_alg); + if (!ret) { + sha1_avx2_registered = 1; + driver_name = sha1_avx2_alg.base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required features (AVX2, BMI1, BMI2) not supported\n"); + } + + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + + ret = crypto_register_shash(&sha1_avx_alg); + if (!ret) { + sha1_avx_registered = 1; + driver_name = sha1_avx_alg.base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } + + /* SSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shash(&sha1_ssse3_alg); + if (!ret) { + sha1_ssse3_registered = 1; + driver_name = sha1_ssse3_alg.base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, AVX2=%s, SHA-NI=%s): driver=%s\n", + sha1_ssse3_registered ? "yes" : "no", + sha1_avx_registered ? "yes" : "no", + sha1_avx2_registered ? "yes" : "no", + sha1_ni_registered ? "yes" : "no", + driver_name); return 0; -fail: - return -ENODEV; } static void __exit sha1_ssse3_mod_fini(void) diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 8a0fb308fbba..cd7bf2b48f3d 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -150,19 +150,18 @@ static struct shash_alg sha256_ssse3_algs[] = { { } } }; -static int register_sha256_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shashes(sha256_ssse3_algs, - ARRAY_SIZE(sha256_ssse3_algs)); - return 0; -} +static bool sha256_ssse3_registered; +static bool sha256_avx_registered; +static bool sha256_avx2_registered; +static bool sha256_ni_registered; static void unregister_sha256_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha256_ssse3_registered) { crypto_unregister_shashes(sha256_ssse3_algs, ARRAY_SIZE(sha256_ssse3_algs)); + sha256_ssse3_registered = 0; + } } asmlinkage void sha256_transform_avx(struct sha256_state *state, @@ -215,30 +214,13 @@ static struct shash_alg sha256_avx_algs[] = { { } } }; -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} - -static int register_sha256_avx(void) -{ - if (avx_usable()) - return crypto_register_shashes(sha256_avx_algs, - ARRAY_SIZE(sha256_avx_algs)); - return 0; -} - static void unregister_sha256_avx(void) { - if (avx_usable()) + if (sha256_avx_registered) { crypto_unregister_shashes(sha256_avx_algs, ARRAY_SIZE(sha256_avx_algs)); + sha256_avx_registered = 0; + } } asmlinkage void sha256_transform_rorx(struct sha256_state *state, @@ -291,28 +273,13 @@ static struct shash_alg sha256_avx2_algs[] = { { } } }; -static bool avx2_usable(void) -{ - if 
(avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) && - boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - -static int register_sha256_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shashes(sha256_avx2_algs, - ARRAY_SIZE(sha256_avx2_algs)); - return 0; -} - static void unregister_sha256_avx2(void) { - if (avx2_usable()) + if (sha256_avx2_registered) { crypto_unregister_shashes(sha256_avx2_algs, ARRAY_SIZE(sha256_avx2_algs)); + sha256_avx2_registered = 0; + } } #ifdef CONFIG_AS_SHA256_NI @@ -375,55 +342,92 @@ static const struct x86_cpu_id module_cpu_ids[] = { }; MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); -static int register_sha256_ni(void) -{ - if (boot_cpu_has(X86_FEATURE_SHA_NI)) - return crypto_register_shashes(sha256_ni_algs, - ARRAY_SIZE(sha256_ni_algs)); - return 0; -} - static void unregister_sha256_ni(void) { - if (boot_cpu_has(X86_FEATURE_SHA_NI)) + if (sha256_ni_registered) { crypto_unregister_shashes(sha256_ni_algs, ARRAY_SIZE(sha256_ni_algs)); + sha256_ni_registered = 0; + } } #else -static inline int register_sha256_ni(void) { return 0; } static inline void unregister_sha256_ni(void) { } #endif static int __init sha256_ssse3_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + const char *feature_name; + const char *driver_name = NULL; + const char *driver_name2 = NULL; + int ret; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU features (SSSE3, AVX, AVX2, or SHA-NI) not supported\n"); return -ENODEV; + } - if (register_sha256_ssse3()) - goto fail; + /* SHA-NI */ + if (boot_cpu_has(X86_FEATURE_SHA_NI)) { - if (register_sha256_avx()) { - unregister_sha256_ssse3(); - goto fail; - } + ret = crypto_register_shashes(sha256_ni_algs, + ARRAY_SIZE(sha256_ni_algs)); + if (!ret) { + sha256_ni_registered = 1; + driver_name = sha256_ni_algs[0].base.cra_driver_name; + driver_name2 = sha256_ni_algs[1].base.cra_driver_name; + } - if (register_sha256_avx2()) { - unregister_sha256_avx(); - unregister_sha256_ssse3(); - goto fail; - } + /* AVX2 */ + } else if (boot_cpu_has(X86_FEATURE_AVX2)) { + + if (boot_cpu_has(X86_FEATURE_BMI2)) { + ret = crypto_register_shashes(sha256_avx2_algs, + ARRAY_SIZE(sha256_avx2_algs)); + if (!ret) { + sha256_avx2_registered = 1; + driver_name = sha256_avx2_algs[0].base.cra_driver_name; + driver_name2 = sha256_avx2_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required CPU features (AVX2, BMI2) not supported\n"); + } - if (register_sha256_ni()) { - unregister_sha256_avx2(); - unregister_sha256_avx(); - unregister_sha256_ssse3(); - goto fail; + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + ret = crypto_register_shashes(sha256_avx_algs, + ARRAY_SIZE(sha256_avx_algs)); + if (!ret) { + sha256_avx_registered = 1; + driver_name = sha256_avx_algs[0].base.cra_driver_name; + driver_name2 = sha256_avx_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } + + /* SSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shashes(sha256_ssse3_algs, + ARRAY_SIZE(sha256_ssse3_algs)); + if (!ret) { + sha256_ssse3_registered = 1; + driver_name = sha256_ssse3_algs[0].base.cra_driver_name; + driver_name2 = sha256_ssse3_algs[1].base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, 
AVX2=%s, SHA-NI=%s): drivers=%s, %s\n", + sha256_ssse3_registered ? "yes" : "no", + sha256_avx_registered ? "yes" : "no", + sha256_avx2_registered ? "yes" : "no", + sha256_ni_registered ? "yes" : "no", + driver_name, driver_name2); return 0; -fail: - return -ENODEV; } static void __exit sha256_ssse3_mod_fini(void) diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index fd5075a32613..df9f8207cc79 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -149,33 +149,21 @@ static struct shash_alg sha512_ssse3_algs[] = { { } } }; -static int register_sha512_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shashes(sha512_ssse3_algs, - ARRAY_SIZE(sha512_ssse3_algs)); - return 0; -} +static bool sha512_ssse3_registered; +static bool sha512_avx_registered; +static bool sha512_avx2_registered; static void unregister_sha512_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha512_ssse3_registered) { crypto_unregister_shashes(sha512_ssse3_algs, ARRAY_SIZE(sha512_ssse3_algs)); + sha512_ssse3_registered = 0; + } } asmlinkage void sha512_transform_avx(struct sha512_state *state, const u8 *data, int blocks); -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} static int sha512_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) @@ -225,19 +213,13 @@ static struct shash_alg sha512_avx_algs[] = { { } } }; -static int register_sha512_avx(void) -{ - if (avx_usable()) - return crypto_register_shashes(sha512_avx_algs, - ARRAY_SIZE(sha512_avx_algs)); - return 0; -} - static void unregister_sha512_avx(void) { - if (avx_usable()) + if (sha512_avx_registered) { crypto_unregister_shashes(sha512_avx_algs, ARRAY_SIZE(sha512_avx_algs)); + sha512_avx_registered = 0; + } } asmlinkage void sha512_transform_rorx(struct sha512_state *state, @@ -291,22 +273,6 @@ static struct shash_alg sha512_avx2_algs[] = { { } } }; -static bool avx2_usable(void) -{ - if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) && - boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - -static int register_sha512_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shashes(sha512_avx2_algs, - ARRAY_SIZE(sha512_avx2_algs)); - return 0; -} static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), @@ -317,33 +283,73 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static void unregister_sha512_avx2(void) { - if (avx2_usable()) + if (sha512_avx2_registered) { crypto_unregister_shashes(sha512_avx2_algs, ARRAY_SIZE(sha512_avx2_algs)); + sha512_avx2_registered = 0; + } } static int __init sha512_ssse3_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + const char *feature_name; + const char *driver_name = NULL; + const char *driver_name2 = NULL; + int ret; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU features (SSSE3, AVX, or AVX2) not supported\n"); return -ENODEV; + } - if (register_sha512_ssse3()) - goto fail; + /* AVX2 */ + if (boot_cpu_has(X86_FEATURE_AVX2)) { + if (boot_cpu_has(X86_FEATURE_BMI2)) { + ret = crypto_register_shashes(sha512_avx2_algs, + ARRAY_SIZE(sha512_avx2_algs)); + if (!ret) { + sha512_avx2_registered = 1; + driver_name = sha512_avx2_algs[0].base.cra_driver_name; + 
driver_name2 = sha512_avx2_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required CPU features (AVX2, BMI2) not supported\n"); + } - if (register_sha512_avx()) { - unregister_sha512_ssse3(); - goto fail; - } + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + ret = crypto_register_shashes(sha512_avx_algs, + ARRAY_SIZE(sha512_avx_algs)); + if (!ret) { + sha512_avx_registered = 1; + driver_name = sha512_avx_algs[0].base.cra_driver_name; + driver_name2 = sha512_avx_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } - if (register_sha512_avx2()) { - unregister_sha512_avx(); - unregister_sha512_ssse3(); - goto fail; + /* SSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shashes(sha512_ssse3_algs, + ARRAY_SIZE(sha512_ssse3_algs)); + if (!ret) { + sha512_ssse3_registered = 1; + driver_name = sha512_ssse3_algs[0].base.cra_driver_name; + driver_name2 = sha512_ssse3_algs[1].base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, AVX2=%s): drivers=%s, %s\n", + sha512_ssse3_registered ? "yes" : "no", + sha512_avx_registered ? "yes" : "no", + sha512_avx2_registered ? "yes" : "no", + driver_name, driver_name2); return 0; -fail: - return -ENODEV; } static void __exit sha512_ssse3_mod_fini(void)