From patchwork Wed Nov 16 04:13:19 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625255
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 01/24] crypto: tcrypt - test crc32
Date: Tue, 15 Nov 2022 22:13:19 -0600
Message-Id: <20221116041342.3841-2-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Add self-test and speed tests for crc32, paralleling those offered
for crc32c and crct10dif.

Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index a82679b576bb..4426386dfb42 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1711,6 +1711,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("gcm(aria)");
 		break;
 
+	case 59:
+		ret += tcrypt_test("crc32");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
@@ -2317,6 +2321,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 			       generic_hash_speed_template);
 		if (mode > 300 && mode < 400) break;
 		fallthrough;
+	case 329:
+		test_hash_speed("crc32", sec, generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+		fallthrough;
 	case 399:
 		break;
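The new modes follow the usual tcrypt convention: load the module with
the mode number to run. A usage sketch, assuming a modular tcrypt build
(the module intentionally never stays loaded; results appear in dmesg):

    modprobe tcrypt mode=59          # crc32 self-test
    modprobe tcrypt mode=329 sec=1   # crc32 speed test, 1 second per buffer size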
From patchwork Wed Nov 16 04:13:20 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625846
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 02/24] crypto: tcrypt - test nhpoly1305
Date: Tue, 15 Nov 2022 22:13:20 -0600
Message-Id: <20221116041342.3841-3-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Add a self-test mode for nhpoly1305.

Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 4426386dfb42..7a6a56751043 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1715,6 +1715,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("crc32");
 		break;
 
+	case 60:
+		ret += tcrypt_test("nhpoly1305");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
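As with the other self-tests, the new mode is invoked by number
(sketch, assuming a modular tcrypt build):

    modprobe tcrypt mode=60          # nhpoly1305 self-test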
From patchwork Wed Nov 16 04:13:21 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625254
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 03/24] crypto: tcrypt - reschedule during cycles speed tests
Date: Tue, 15 Nov 2022 22:13:21 -0600
Message-Id: <20221116041342.3841-4-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

commit 2af632996b89 ("crypto: tcrypt - reschedule during speed tests")
added cond_resched() calls to "Avoid RCU stalls in the case of
non-preemptible kernel and lengthy speed tests by rescheduling when
advancing from one block size to another."

However, it only makes those calls if the sec module parameter is used
(i.e., running each speed test for a certain number of seconds), not
in the default "cycles" mode. Expand those calls to also run in
"cycles" mode to reduce the rate of RCU stall warnings:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:

Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 44 ++++++++++++++++++--------------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 7a6a56751043..c025ba26b663 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -408,14 +408,13 @@ static void test_mb_aead_speed(const char *algo, int enc, int secs,
 		}
 
-		if (secs) {
+		if (secs)
 			ret = test_mb_aead_jiffies(data, enc, bs,
 						   secs, num_mb);
-			cond_resched();
-		} else {
+		else
 			ret = test_mb_aead_cycles(data, enc, bs,
 						  num_mb);
-		}
+		cond_resched();
 
 		if (ret) {
 			pr_err("%s() failed return code=%d\n", e, ret);
@@ -661,13 +660,11 @@ static void test_aead_speed(const char *algo, int enc, unsigned int secs,
 					       bs + (enc ? 0 : authsize),
 					       iv);
 
-		if (secs) {
-			ret = test_aead_jiffies(req, enc, bs,
-						secs);
-			cond_resched();
-		} else {
+		if (secs)
+			ret = test_aead_jiffies(req, enc, bs, secs);
+		else
 			ret = test_aead_cycles(req, enc, bs);
-		}
+		cond_resched();
 
 		if (ret) {
 			pr_err("%s() failed return code=%d\n", e, ret);
@@ -917,14 +914,13 @@ static void test_ahash_speed_common(const char *algo, unsigned int secs,
 		ahash_request_set_crypt(req, sg, output, speed[i].plen);
 
-		if (secs) {
+		if (secs)
 			ret = test_ahash_jiffies(req, speed[i].blen,
 						 speed[i].plen, output, secs);
-			cond_resched();
-		} else {
+		else
 			ret = test_ahash_cycles(req, speed[i].blen,
 						speed[i].plen, output);
-		}
+		cond_resched();
 
 		if (ret) {
 			pr_err("hashing failed ret=%d\n", ret);
@@ -1184,15 +1180,14 @@ static void test_mb_skcipher_speed(const char *algo, int enc, int secs,
 						    cur->sg, bs, iv);
 			}
 
-			if (secs) {
+			if (secs)
 				ret = test_mb_acipher_jiffies(data, enc,
 							      bs, secs,
 							      num_mb);
-				cond_resched();
-			} else {
+			else
 				ret = test_mb_acipher_cycles(data, enc,
 							     bs, num_mb);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
@@ -1401,14 +1396,11 @@ static void test_skcipher_speed(const char *algo, int enc, unsigned int secs,
 			skcipher_request_set_crypt(req, sg, sg, bs, iv);
 
-			if (secs) {
-				ret = test_acipher_jiffies(req, enc,
-							   bs, secs);
-				cond_resched();
-			} else {
-				ret = test_acipher_cycles(req, enc,
-							  bs);
-			}
+			if (secs)
+				ret = test_acipher_jiffies(req, enc, bs, secs);
+			else
+				ret = test_acipher_cycles(req, enc, bs);
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
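The same transformation is applied to every speed-test loop above; its
shape, reduced to a sketch (function names abbreviated):

    if (secs)
        ret = test_xxx_jiffies(...);   /* used to call cond_resched() here */
    else
        ret = test_xxx_cycles(...);    /* used to run with no cond_resched() */
    cond_resched();                    /* now reached in both modes */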
From patchwork Wed Nov 16 04:13:22 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625842
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 04/24] crypto: x86/sha - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:22 -0600
Message-Id: <20221116041342.3841-5-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while running. This leads
to "rcu_preempt detected expedited stalls" with stack dumps pointing
to the optimized hash function if the module is loaded and used a lot:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

For example, that can occur during boot with the stack trace pointing
to the sha512-x86 function if the system is set to use SHA-512 for
module signing. The call trace includes:

    module_sig_check
    mod_verify_sig
    pkcs7_verify
    pkcs7_digest
    sha512_finup
    sha512_base_do_update

Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64")
Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
v3 simplify to while loops rather than do..while loops, avoid
   redundant checks for zero length, rename the limit macro and
   change into a const, vary the limit for each algo
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 64 ++++++++++++++++++++++-------
 arch/x86/crypto/sha256_ssse3_glue.c | 64 ++++++++++++++++++++++-------
 arch/x86/crypto/sha512_ssse3_glue.c | 55 +++++++++++++++++++------
 3 files changed, 140 insertions(+), 43 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 44340a1139e0..4bc77c84b0fb 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -26,8 +26,17 @@
 #include <asm/cpu_device_id.h>
 #include <asm/simd.h>
 
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+#ifdef CONFIG_AS_SHA1_NI
+static const unsigned int bytes_per_fpu_shani = 34 * 1024;
+#endif
+static const unsigned int bytes_per_fpu_avx2 = 34 * 1024;
+static const unsigned int bytes_per_fpu_avx = 30 * 1024;
+static const unsigned int bytes_per_fpu_ssse3 = 26 * 1024;
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
-		       unsigned int len, sha1_block_fn *sha1_xform)
+		       unsigned int len, unsigned int bytes_per_fpu,
+		       sha1_block_fn *sha1_xform)
 {
 	struct sha1_state *sctx = shash_desc_ctx(desc);
 
@@ -41,22 +50,39 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha1_base_do_update(desc, data, len, sha1_xform);
-	kernel_fpu_end();
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha1_base_do_update(desc, data, chunk, sha1_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
 
 	return 0;
 }
 
 static int sha1_finup(struct shash_desc *desc, const u8 *data,
-		      unsigned int len, u8 *out, sha1_block_fn *sha1_xform)
+		      unsigned int len, unsigned int bytes_per_fpu,
+		      u8 *out, sha1_block_fn *sha1_xform)
 {
 	if (!crypto_simd_usable())
 		return crypto_sha1_finup(desc, data, len, out);
 
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha1_base_do_update(desc, data, chunk, sha1_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
+
 	kernel_fpu_begin();
-	if (len)
-		sha1_base_do_update(desc, data, len, sha1_xform);
 	sha1_base_do_finalize(desc, sha1_xform);
 	kernel_fpu_end();
 
@@ -69,13 +95,15 @@ asmlinkage void sha1_transform_ssse3(struct sha1_state *state,
 static int sha1_ssse3_update(struct shash_desc *desc, const u8 *data,
 			     unsigned int len)
 {
-	return sha1_update(desc, data, len, sha1_transform_ssse3);
+	return sha1_update(desc, data, len, bytes_per_fpu_ssse3,
+			   sha1_transform_ssse3);
 }
 
 static int sha1_ssse3_finup(struct shash_desc *desc, const u8 *data,
 			    unsigned int len, u8 *out)
 {
-	return sha1_finup(desc, data, len, out, sha1_transform_ssse3);
+	return sha1_finup(desc, data, len, bytes_per_fpu_ssse3, out,
+			  sha1_transform_ssse3);
 }
 
 /* Add padding and return the message digest. */
@@ -119,13 +147,15 @@ asmlinkage void sha1_transform_avx(struct sha1_state *state,
 static int sha1_avx_update(struct shash_desc *desc, const u8 *data,
 			   unsigned int len)
 {
-	return sha1_update(desc, data, len, sha1_transform_avx);
+	return sha1_update(desc, data, len, bytes_per_fpu_avx,
+			   sha1_transform_avx);
 }
 
 static int sha1_avx_finup(struct shash_desc *desc, const u8 *data,
 			  unsigned int len, u8 *out)
 {
-	return sha1_finup(desc, data, len, out, sha1_transform_avx);
+	return sha1_finup(desc, data, len, bytes_per_fpu_avx, out,
+			  sha1_transform_avx);
 }
 
 static int sha1_avx_final(struct shash_desc *desc, u8 *out)
@@ -201,13 +231,15 @@ static void sha1_apply_transform_avx2(struct sha1_state *state,
 static int sha1_avx2_update(struct shash_desc *desc, const u8 *data,
 			    unsigned int len)
 {
-	return sha1_update(desc, data, len, sha1_apply_transform_avx2);
+	return sha1_update(desc, data, len, bytes_per_fpu_avx2,
+			   sha1_apply_transform_avx2);
 }
 
 static int sha1_avx2_finup(struct shash_desc *desc, const u8 *data,
 			   unsigned int len, u8 *out)
 {
-	return sha1_finup(desc, data, len, out, sha1_apply_transform_avx2);
+	return sha1_finup(desc, data, len, bytes_per_fpu_avx2, out,
+			  sha1_apply_transform_avx2);
 }
 
 static int sha1_avx2_final(struct shash_desc *desc, u8 *out)
@@ -251,13 +283,15 @@ asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data,
 static int sha1_ni_update(struct shash_desc *desc, const u8 *data,
 			  unsigned int len)
 {
-	return sha1_update(desc, data, len, sha1_ni_transform);
+	return sha1_update(desc, data, len, bytes_per_fpu_shani,
+			   sha1_ni_transform);
 }
 
 static int sha1_ni_finup(struct shash_desc *desc, const u8 *data,
 			 unsigned int len, u8 *out)
 {
-	return sha1_finup(desc, data, len, out, sha1_ni_transform);
+	return sha1_finup(desc, data, len, bytes_per_fpu_shani, out,
+			  sha1_ni_transform);
 }
 
 static int sha1_ni_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 3a5f6be7dbba..cdcdf5a80ffe 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -40,11 +40,20 @@
 #include <asm/cpu_device_id.h>
 #include <asm/simd.h>
 
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+#ifdef CONFIG_AS_SHA256_NI
+static const unsigned int bytes_per_fpu_shani = 13 * 1024;
+#endif
+static const unsigned int bytes_per_fpu_avx2 = 13 * 1024;
+static const unsigned int bytes_per_fpu_avx = 11 * 1024;
+static const unsigned int bytes_per_fpu_ssse3 = 11 * 1024;
+
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
 				       const u8 *data, int blocks);
 
 static int _sha256_update(struct shash_desc *desc, const u8 *data,
-			  unsigned int len, sha256_block_fn *sha256_xform)
+			  unsigned int len, unsigned int bytes_per_fpu,
+			  sha256_block_fn *sha256_xform)
 {
 	struct sha256_state *sctx = shash_desc_ctx(desc);
 
@@ -58,22 +67,39 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha256_base_do_update(desc, data, len, sha256_xform);
-	kernel_fpu_end();
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha256_base_do_update(desc, data, chunk, sha256_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
 
 	return 0;
 }
 
 static int sha256_finup(struct shash_desc *desc, const u8 *data,
-			unsigned int len, u8 *out, sha256_block_fn *sha256_xform)
+			unsigned int len, unsigned int bytes_per_fpu,
+			u8 *out, sha256_block_fn *sha256_xform)
 {
 	if (!crypto_simd_usable())
 		return crypto_sha256_finup(desc, data, len, out);
 
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha256_base_do_update(desc, data, chunk, sha256_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
+
 	kernel_fpu_begin();
-	if (len)
-		sha256_base_do_update(desc, data, len, sha256_xform);
 	sha256_base_do_finalize(desc, sha256_xform);
 	kernel_fpu_end();
 
@@ -83,13 +109,15 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
 static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
 			       unsigned int len)
 {
-	return _sha256_update(desc, data, len, sha256_transform_ssse3);
+	return _sha256_update(desc, data, len, bytes_per_fpu_ssse3,
+			      sha256_transform_ssse3);
 }
 
 static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
 			      unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_ssse3);
+	return sha256_finup(desc, data, len, bytes_per_fpu_ssse3,
+			    out, sha256_transform_ssse3);
 }
 
 /* Add padding and return the message digest. */
@@ -149,13 +177,15 @@ asmlinkage void sha256_transform_avx(struct sha256_state *state,
 static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
 			     unsigned int len)
 {
-	return _sha256_update(desc, data, len, sha256_transform_avx);
+	return _sha256_update(desc, data, len, bytes_per_fpu_avx,
+			      sha256_transform_avx);
 }
 
 static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
 			    unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_avx);
+	return sha256_finup(desc, data, len, bytes_per_fpu_avx,
+			    out, sha256_transform_avx);
 }
 
 static int sha256_avx_final(struct shash_desc *desc, u8 *out)
@@ -225,13 +255,15 @@ asmlinkage void sha256_transform_rorx(struct sha256_state *state,
 static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
 			      unsigned int len)
 {
-	return _sha256_update(desc, data, len, sha256_transform_rorx);
+	return _sha256_update(desc, data, len, bytes_per_fpu_avx2,
+			      sha256_transform_rorx);
 }
 
 static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_rorx);
+	return sha256_finup(desc, data, len, bytes_per_fpu_avx2,
+			    out, sha256_transform_rorx);
 }
 
 static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
@@ -300,13 +332,15 @@ asmlinkage void sha256_ni_transform(struct sha256_state *digest,
 static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
 			    unsigned int len)
 {
-	return _sha256_update(desc, data, len, sha256_ni_transform);
+	return _sha256_update(desc, data, len, bytes_per_fpu_shani,
+			      sha256_ni_transform);
 }
 
 static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,
 			   unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_ni_transform);
+	return sha256_finup(desc, data, len, bytes_per_fpu_shani,
+			    out, sha256_ni_transform);
 }
 
 static int sha256_ni_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index 6d3b85e53d0e..c7036cfe2a7e 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -39,11 +39,17 @@
 #include <asm/cpu_device_id.h>
 #include <asm/simd.h>
 
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu_avx2 = 20 * 1024;
+static const unsigned int bytes_per_fpu_avx = 17 * 1024;
+static const unsigned int bytes_per_fpu_ssse3 = 17 * 1024;
+
 asmlinkage void sha512_transform_ssse3(struct sha512_state *state,
 				       const u8 *data, int blocks);
 
 static int sha512_update(struct shash_desc *desc, const u8 *data,
-			 unsigned int len, sha512_block_fn *sha512_xform)
+			 unsigned int len, unsigned int bytes_per_fpu,
+			 sha512_block_fn *sha512_xform)
 {
 	struct sha512_state *sctx = shash_desc_ctx(desc);
 
@@ -57,22 +63,39 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha512_base_do_update(desc, data, len, sha512_xform);
-	kernel_fpu_end();
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha512_base_do_update(desc, data, chunk, sha512_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
 
 	return 0;
 }
 
 static int sha512_finup(struct shash_desc *desc, const u8 *data,
-			unsigned int len, u8 *out, sha512_block_fn *sha512_xform)
+			unsigned int len, unsigned int bytes_per_fpu,
+			u8 *out, sha512_block_fn *sha512_xform)
 {
 	if (!crypto_simd_usable())
 		return crypto_sha512_finup(desc, data, len, out);
 
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sha512_base_do_update(desc, data, chunk, sha512_xform);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
+
 	kernel_fpu_begin();
-	if (len)
-		sha512_base_do_update(desc, data, len, sha512_xform);
 	sha512_base_do_finalize(desc, sha512_xform);
 	kernel_fpu_end();
 
@@ -82,13 +105,15 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data,
 static int sha512_ssse3_update(struct shash_desc *desc, const u8 *data,
 			       unsigned int len)
 {
-	return sha512_update(desc, data, len, sha512_transform_ssse3);
+	return sha512_update(desc, data, len, bytes_per_fpu_ssse3,
+			     sha512_transform_ssse3);
 }
 
 static int sha512_ssse3_finup(struct shash_desc *desc, const u8 *data,
 			      unsigned int len, u8 *out)
 {
-	return sha512_finup(desc, data, len, out, sha512_transform_ssse3);
+	return sha512_finup(desc, data, len, bytes_per_fpu_ssse3,
+			    out, sha512_transform_ssse3);
 }
 
 /* Add padding and return the message digest. */
@@ -158,13 +183,15 @@ static bool avx_usable(void)
 static int sha512_avx_update(struct shash_desc *desc, const u8 *data,
 			     unsigned int len)
 {
-	return sha512_update(desc, data, len, sha512_transform_avx);
+	return sha512_update(desc, data, len, bytes_per_fpu_avx,
+			     sha512_transform_avx);
 }
 
 static int sha512_avx_finup(struct shash_desc *desc, const u8 *data,
 			    unsigned int len, u8 *out)
 {
-	return sha512_finup(desc, data, len, out, sha512_transform_avx);
+	return sha512_finup(desc, data, len, bytes_per_fpu_avx,
+			    out, sha512_transform_avx);
 }
 
 /* Add padding and return the message digest. */
@@ -224,13 +251,15 @@ asmlinkage void sha512_transform_rorx(struct sha512_state *state,
 static int sha512_avx2_update(struct shash_desc *desc, const u8 *data,
 			      unsigned int len)
 {
-	return sha512_update(desc, data, len, sha512_transform_rorx);
+	return sha512_update(desc, data, len, bytes_per_fpu_avx2,
+			     sha512_transform_rorx);
 }
 
 static int sha512_avx2_finup(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *out)
 {
-	return sha512_finup(desc, data, len, out, sha512_transform_rorx);
+	return sha512_finup(desc, data, len, bytes_per_fpu_avx2,
+			    out, sha512_transform_rorx);
 }
 
 /* Add padding and return the message digest. */
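All three glue files adopt the same bounded-FPU-section loop; the
pattern, reduced to a generic sketch (names generalized, not the exact
kernel code):

    while (len) {
        unsigned int chunk = min(len, bytes_per_fpu);

        kernel_fpu_begin();
        sha_base_do_update(desc, data, chunk, xform);  /* SIMD work */
        kernel_fpu_end();      /* preemption/RCU quiescent point */

        len -= chunk;
        data += chunk;
    }

Each kernel_fpu_end() re-enables preemption, so the scheduler can run
between chunks and the non-preemptible interval is bounded by
bytes_per_fpu rather than by the caller's buffer size.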
From patchwork Wed Nov 16 04:13:23 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625248
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 05/24] crypto: x86/crc - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:23 -0600
Message-Id: <20221116041342.3841-6-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while running, leading to:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 78c37d191dd6 ("crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation")
Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction")
Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
v3 use while loops and static int, simplify one of the loop
   structures, add algorithm-specific limits, use local stack variable
   in crc32 finup rather than the context pointer like update uses
---
 arch/x86/crypto/crc32-pclmul_asm.S      |  6 +--
 arch/x86/crypto/crc32-pclmul_glue.c     | 27 +++++++++----
 arch/x86/crypto/crc32c-intel_glue.c     | 52 ++++++++++++++++++-------
 arch/x86/crypto/crct10dif-pclmul_glue.c | 48 +++++++++++++++++------
 4 files changed, 99 insertions(+), 34 deletions(-)

diff --git a/arch/x86/crypto/crc32-pclmul_asm.S b/arch/x86/crypto/crc32-pclmul_asm.S
index ca53e96996ac..9abd861636c3 100644
--- a/arch/x86/crypto/crc32-pclmul_asm.S
+++ b/arch/x86/crypto/crc32-pclmul_asm.S
@@ -72,15 +72,15 @@
 .text
 /**
  *      Calculate crc32
- *      BUF - buffer (16 bytes aligned)
- *      LEN - sizeof buffer (16 bytes aligned), LEN should be grater than 63
+ *      BUF - buffer - must be 16 bytes aligned
+ *      LEN - sizeof buffer - must be multiple of 16 bytes and greater than 63
  *      CRC - initial crc32
  *      return %eax crc32
  *      uint crc32_pclmul_le_16(unsigned char const *buffer,
  *                           size_t len, uint crc32)
  */
-SYM_FUNC_START(crc32_pclmul_le_16) /* buffer and buffer size are 16 bytes aligned */
+SYM_FUNC_START(crc32_pclmul_le_16)
 	movdqa  (BUF), %xmm1
 	movdqa  0x10(BUF), %xmm2
 	movdqa  0x20(BUF), %xmm3
diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 98cf3b4e4c9f..df3dbc754818 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -46,6 +46,9 @@
 #define SCALE_F			16L	/* size of xmm register */
 #define SCALE_F_MASK		(SCALE_F - 1)
 
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 655 * 1024;
+
 u32 crc32_pclmul_le_16(unsigned char const *buffer, size_t len, u32 crc32);
 
 static u32 __attribute__((pure))
@@ -55,6 +58,9 @@ static u32 __attribute__((pure))
 	unsigned int iremainder;
 	unsigned int prealign;
 
+	BUILD_BUG_ON(bytes_per_fpu < PCLMUL_MIN_LEN);
+	BUILD_BUG_ON(bytes_per_fpu & SCALE_F_MASK);
+
 	if (len < PCLMUL_MIN_LEN + SCALE_F_MASK || !crypto_simd_usable())
 		return crc32_le(crc, p, len);
 
@@ -70,12 +76,19 @@ static u32 __attribute__((pure))
 	iquotient = len & (~SCALE_F_MASK);
 	iremainder = len & SCALE_F_MASK;
 
-	kernel_fpu_begin();
-	crc = crc32_pclmul_le_16(p, iquotient, crc);
-	kernel_fpu_end();
+	while (iquotient >= PCLMUL_MIN_LEN) {
+		unsigned int chunk = min(iquotient, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		crc = crc32_pclmul_le_16(p, chunk, crc);
+		kernel_fpu_end();
+
+		iquotient -= chunk;
+		p += chunk;
+	}
 
-	if (iremainder)
-		crc = crc32_le(crc, p + iquotient, iremainder);
+	if (iquotient || iremainder)
+		crc = crc32_le(crc, p, iquotient + iremainder);
 
 	return crc;
 }
@@ -120,8 +133,8 @@ static int crc32_pclmul_update(struct shash_desc *desc, const u8 *data,
 }
 
 /* No final XOR 0xFFFFFFFF, like crc32_le */
-static int __crc32_pclmul_finup(u32 *crcp, const u8 *data, unsigned int len,
-				u8 *out)
+static int __crc32_pclmul_finup(const u32 *crcp, const u8 *data,
+				unsigned int len, u8 *out)
 {
 	*(__le32 *)out = cpu_to_le32(crc32_pclmul_le(*crcp, data, len));
 	return 0;
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index feccb5254c7e..f08ed68ec93d 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -45,7 +45,10 @@ asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
 				unsigned int crc_init);
 #endif /* CONFIG_X86_64 */
 
-static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t length)
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 868 * 1024;
+
+static u32 crc32c_intel_le_hw_byte(u32 crc, const unsigned char *data, size_t length)
 {
 	while (length--) {
 		asm("crc32b %1, %0"
@@ -56,7 +59,7 @@ static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t le
 	return crc;
 }
 
-static u32 __pure crc32c_intel_le_hw(u32 crc, unsigned char const *p, size_t len)
+static u32 __pure crc32c_intel_le_hw(u32 crc, const unsigned char *p, size_t len)
 {
 	unsigned int iquotient = len / SCALE_F;
 	unsigned int iremainder = len % SCALE_F;
@@ -110,8 +113,8 @@ static int crc32c_intel_update(struct shash_desc *desc, const u8 *data,
 	return 0;
 }
 
-static int __crc32c_intel_finup(u32 *crcp, const u8 *data, unsigned int len,
-				u8 *out)
+static int __crc32c_intel_finup(const u32 *crcp, const u8 *data,
+				unsigned int len, u8 *out)
 {
 	*(__le32 *)out = ~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len));
 	return 0;
@@ -153,29 +156,52 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
 {
 	u32 *crcp = shash_desc_ctx(desc);
 
+	BUILD_BUG_ON(bytes_per_fpu < CRC32C_PCL_BREAKEVEN);
+	BUILD_BUG_ON(bytes_per_fpu % SCALE_F);
+
 	/*
 	 * use faster PCL version if datasize is large enough to
 	 * overcome kernel fpu state save/restore overhead
 	 */
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*crcp = crc_pcl(data, len, *crcp);
-		kernel_fpu_end();
+		while (len) {
+			unsigned int chunk = min(len, bytes_per_fpu);
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		}
 	} else
 		*crcp = crc32c_intel_le_hw(*crcp, data, len);
 	return 0;
 }
 
-static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data, unsigned int len,
-				u8 *out)
+static int __crc32c_pcl_intel_finup(const u32 *crcp, const u8 *data,
+				    unsigned int len, u8 *out)
 {
+	u32 crc = *crcp;
+
+	BUILD_BUG_ON(bytes_per_fpu < CRC32C_PCL_BREAKEVEN);
+	BUILD_BUG_ON(bytes_per_fpu % SCALE_F);
+
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp));
-		kernel_fpu_end();
+		while (len) {
+			unsigned int chunk = min(len, bytes_per_fpu);
+
+			kernel_fpu_begin();
+			crc = crc_pcl(data, chunk, crc);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		}
+		*(__le32 *)out = ~cpu_to_le32(crc);
 	} else
 		*(__le32 *)out =
-			~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len));
+			~cpu_to_le32(crc32c_intel_le_hw(crc, data, len));
 	return 0;
 }
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 71291d5af9f4..4f6b8c727d88 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -34,6 +34,11 @@
 #include <asm/cpu_device_id.h>
 #include <asm/simd.h>
 
+#define PCLMUL_MIN_LEN		16U	/* minimum size of buffer for crc_t10dif_pcl */
+
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 614 * 1024;
+
 asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len);
 
 struct chksum_desc_ctx {
@@ -54,11 +59,21 @@ static int chksum_update(struct shash_desc *desc, const u8 *data,
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
 
-	if (length >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		ctx->crc = crc_t10dif_pcl(ctx->crc, data, length);
-		kernel_fpu_end();
-	} else
+	BUILD_BUG_ON(bytes_per_fpu < PCLMUL_MIN_LEN);
+
+	if (length >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		while (length >= PCLMUL_MIN_LEN) {
+			unsigned int chunk = min(length, bytes_per_fpu);
+
+			kernel_fpu_begin();
+			ctx->crc = crc_t10dif_pcl(ctx->crc, data, chunk);
+			kernel_fpu_end();
+
+			length -= chunk;
+			data += chunk;
+		}
+	}
+	if (length)
 		ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	return 0;
 }
@@ -73,12 +88,23 @@ static int chksum_final(struct shash_desc *desc, u8 *out)
 
 static int __chksum_finup(__u16 crc, const u8 *data, unsigned int len, u8 *out)
 {
-	if (len >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__u16 *)out = crc_t10dif_pcl(crc, data, len);
-		kernel_fpu_end();
-	} else
-		*(__u16 *)out = crc_t10dif_generic(crc, data, len);
+	BUILD_BUG_ON(bytes_per_fpu < PCLMUL_MIN_LEN);
+
+	if (len >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		while (len >= PCLMUL_MIN_LEN) {
+			unsigned int chunk = min(len, bytes_per_fpu);
+
+			kernel_fpu_begin();
+			crc = crc_t10dif_pcl(crc, data, chunk);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		}
+	}
+	if (len)
+		crc = crc_t10dif_generic(crc, data, len);
+	*(__u16 *)out = crc;
 	return 0;
 }
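Splitting a CRC computation like this is safe because CRC updates
chain: each call seeds the next with the running value, so chunked and
single-shot results are identical. A worked example with the kernel's
generic helper (illustrative only):

    u32 crc = ~0;
    crc = crc32_le(crc, buf, 4096);         /* first 4 KiB */
    crc = crc32_le(crc, buf + 4096, 4096);  /* == crc32_le(~0, buf, 8192) */

The BUILD_BUG_ON() checks encode the loop's preconditions at compile
time: the chunk size must stay above the PCLMUL minimum and must
preserve the 16-byte granularity the assembly requires.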
From patchwork Wed Nov 16 04:13:24 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625251
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 06/24] crypto: x86/sm3 - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:24 -0600
Message-Id: <20221116041342.3841-7-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while running, causing:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 930ab34d906d ("crypto: x86/sm3 - add AVX assembly implementation")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
v3 use while loop, static int
---
 arch/x86/crypto/sm3_avx_glue.c | 35 ++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index 661b6f22ffcd..483aaed996ba 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -17,6 +17,9 @@
 #include <asm/cpu_device_id.h>
 #include <asm/simd.h>
 
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 11 * 1024;
+
 asmlinkage void sm3_transform_avx(struct sm3_state *state,
 				  const u8 *data, int nblocks);
 
@@ -25,8 +28,10 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 {
 	struct sm3_state *sctx = shash_desc_ctx(desc);
 
+	BUILD_BUG_ON(bytes_per_fpu == 0);
+
 	if (!crypto_simd_usable() ||
-			(sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
+	    (sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
 		sm3_update(sctx, data, len);
 		return 0;
 	}
@@ -37,9 +42,16 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
 
-	kernel_fpu_begin();
-	sm3_base_do_update(desc, data, len, sm3_transform_avx);
-	kernel_fpu_end();
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
 
 	return 0;
 }
@@ -47,6 +59,8 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out)
 {
+	BUILD_BUG_ON(bytes_per_fpu == 0);
+
 	if (!crypto_simd_usable()) {
 		struct sm3_state *sctx = shash_desc_ctx(desc);
 
@@ -57,9 +71,18 @@ static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		return 0;
 	}
 
+	while (len) {
+		unsigned int chunk = min(len, bytes_per_fpu);
+
+		kernel_fpu_begin();
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	}
+
 	kernel_fpu_begin();
-	if (len)
-		sm3_base_do_update(desc, data, len, sm3_transform_avx);
 	sm3_base_do_finalize(desc, sm3_transform_avx);
 	kernel_fpu_end();
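The finup path in these conversions has the same two-phase shape: hash
the bulk of the data in bounded chunks, then run only the short,
fixed-cost finalization in its own FPU section. A generic sketch (not
the exact kernel code):

    while (len) {
        unsigned int chunk = min(len, bytes_per_fpu);

        kernel_fpu_begin();
        base_do_update(desc, data, chunk, xform);
        kernel_fpu_end();

        len -= chunk;
        data += chunk;
    }

    kernel_fpu_begin();
    base_do_finalize(desc, xform);   /* bounded: padding + final block */
    kernel_fpu_end();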
From patchwork Wed Nov 16 04:13:25 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)" <elliott@hpe.com>
X-Patchwork-Id: 625253
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
	tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
	Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 07/24] crypto: x86/ghash - use u8 rather than char
Date: Tue, 15 Nov 2022 22:13:25 -0600
Message-Id: <20221116041342.3841-8-elliott@hpe.com>
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>

Use more consistent, unambiguous types for the source and destination
buffer pointer arguments.
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_asm.S  | 4 ++--
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 2bf871899920..c7b8542facee 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -88,7 +88,7 @@ SYM_FUNC_START_LOCAL(__clmul_gf128mul_ble)
         RET
 SYM_FUNC_END(__clmul_gf128mul_ble)

-/* void clmul_ghash_mul(char *dst, const u128 *shash) */
+/* void clmul_ghash_mul(u8 *dst, const u128 *shash) */
 SYM_FUNC_START(clmul_ghash_mul)
         FRAME_BEGIN
         movups (%rdi), DATA
@@ -103,7 +103,7 @@ SYM_FUNC_START(clmul_ghash_mul)
 SYM_FUNC_END(clmul_ghash_mul)

 /*
- * void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
+ * void clmul_ghash_update(u8 *dst, const u8 *src, unsigned int srclen,
  *                         const u128 *shash);
  */
 SYM_FUNC_START(clmul_ghash_update)
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 1f1a95f3dd0c..e996627c6583 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -23,9 +23,9 @@
 #define GHASH_BLOCK_SIZE        16
 #define GHASH_DIGEST_SIZE       16

-void clmul_ghash_mul(char *dst, const u128 *shash);
+void clmul_ghash_mul(u8 *dst, const u128 *shash);

-void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
+void clmul_ghash_update(u8 *dst, const u8 *src, unsigned int srclen,
                         const u128 *shash);

 struct ghash_async_ctx {
From patchwork Wed Nov 16 04:13:26 2022
From: Robert Elliott
Subject: [PATCH v4 08/24] crypto: x86/ghash - restructure FPU context saving
Date: Tue, 15 Nov 2022 22:13:26 -0600
Message-Id: <20221116041342.3841-9-elliott@hpe.com>

Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul in
its own set of kernel_fpu_begin and kernel_fpu_end calls, preparing to
limit the amount of data processed by each _update call to avoid RCU
stalls.

This more closely matches how polyval-clmulni_glue is structured.
Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index e996627c6583..22367e363d72 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
         struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
         u8 *dst = dctx->buffer;

-        kernel_fpu_begin();
         if (dctx->bytes) {
                 int n = min(srclen, dctx->bytes);
                 u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
                 while (n--)
                         *pos++ ^= *src++;

-                if (!dctx->bytes)
+                if (!dctx->bytes) {
+                        kernel_fpu_begin();
                         clmul_ghash_mul(dst, &ctx->shash);
+                        kernel_fpu_end();
+                }
         }

+        kernel_fpu_begin();
         clmul_ghash_update(dst, src, srclen, &ctx->shash);
         kernel_fpu_end();
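In outline, the rule this patch applies (a simplified recap of the
diff above, not new code): only the PCLMULQDQ routines need FPU
context, so the plain C bookkeeping stays preemptible:

        /* plain C: fill the partial block, no FPU context needed */
        while (n--)
                *pos++ ^= *src++;

        if (!dctx->bytes) {
                kernel_fpu_begin();
                clmul_ghash_mul(dst, &ctx->shash);      /* SIMD only */
                kernel_fpu_end();
        }

        kernel_fpu_begin();
        clmul_ghash_update(dst, src, srclen, &ctx->shash); /* SIMD only */
        kernel_fpu_end();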
From patchwork Wed Nov 16 04:13:27 2022
From: Robert Elliott
Subject: [PATCH v4 09/24] crypto: x86/ghash - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:27 -0600
Message-Id: <20221116041342.3841-10-elliott@hpe.com>

Limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls. Those functions call preempt_disable() and
preempt_enable(), so the CPU core is unavailable for scheduling while
running, leading to:
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
v3 change to static int, simplify while loop
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 28 +++++++++++++++-------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 22367e363d72..0f24c3b23fd2 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -20,8 +20,11 @@
 #include
 #include

-#define GHASH_BLOCK_SIZE        16
-#define GHASH_DIGEST_SIZE       16
+#define GHASH_BLOCK_SIZE        16U
+#define GHASH_DIGEST_SIZE       16U
+
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 50 * 1024;

 void clmul_ghash_mul(u8 *dst, const u128 *shash);

@@ -80,9 +83,11 @@ static int ghash_update(struct shash_desc *desc,
         struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
         u8 *dst = dctx->buffer;

+        BUILD_BUG_ON(bytes_per_fpu < GHASH_BLOCK_SIZE);
+
         if (dctx->bytes) {
                 int n = min(srclen, dctx->bytes);
-                u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
+                u8 *pos = dst + GHASH_BLOCK_SIZE - dctx->bytes;

                 dctx->bytes -= n;
                 srclen -= n;
@@ -97,13 +102,18 @@ static int ghash_update(struct shash_desc *desc,
                 }
         }

-        kernel_fpu_begin();
-        clmul_ghash_update(dst, src, srclen, &ctx->shash);
-        kernel_fpu_end();
+        while (srclen >= GHASH_BLOCK_SIZE) {
+                unsigned int chunk = min(srclen, bytes_per_fpu);
+
+                kernel_fpu_begin();
+                clmul_ghash_update(dst, src, chunk, &ctx->shash);
+                kernel_fpu_end();
+
+                src += chunk & ~(GHASH_BLOCK_SIZE - 1);
+                srclen -= chunk & ~(GHASH_BLOCK_SIZE - 1);
+        }

-        if (srclen & 0xf) {
-                src += srclen - (srclen & 0xf);
-                srclen &= 0xf;
+        if (srclen) {
                 dctx->bytes = GHASH_BLOCK_SIZE - srclen;
                 while (srclen--)
                         *dst++ ^= *src++;
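A worked example of the block alignment in the loop above, with
hypothetical values (the asm only consumes whole 16-byte blocks):

        unsigned int srclen = 51217;    /* hypothetical input length */
        unsigned int chunk = min(srclen, bytes_per_fpu);  /* 51200 */

        /* 51200 is a multiple of 16, so the whole chunk is consumed
         * and srclen drops to 17. On the next pass chunk = 17; only
         * 16 bytes are hashed, so advance by the aligned amount: */
        unsigned int aligned = chunk & ~(GHASH_BLOCK_SIZE - 1); /* 16 */
        src += aligned;
        srclen -= aligned;
        /* The final byte falls out of the loop and is XORed into
         * dctx->buffer by the if (srclen) tail, to be finished by a
         * later update or final call. */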
From patchwork Wed Nov 16 04:13:28 2022
From: Robert Elliott
Subject: [PATCH v4 10/24] crypto: x86/poly - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:28 -0600
Message-Id: <20221116041342.3841-11-elliott@hpe.com>

Use a static const unsigned int for the limit on the number of bytes
processed between kernel_fpu_begin() and kernel_fpu_end(), rather than
the SZ_4K macro (which is a signed value) or a magic value of 4096U
embedded in the C code.

Use unsigned int rather than size_t for some of the arguments to avoid
typecasting for the min() macro.

Signed-off-by: Robert Elliott
---
v3 use static int rather than macro, change to while loops rather than
   do/while loops
---
 arch/x86/crypto/nhpoly1305-avx2-glue.c | 11 +++++---
 arch/x86/crypto/nhpoly1305-sse2-glue.c | 11 +++++---
 arch/x86/crypto/poly1305_glue.c        | 37 +++++++++++++++++---------
 arch/x86/crypto/polyval-clmulni_glue.c |  8 ++++--
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 8ea5ab0f1ca7..f7dc9c563bb5 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -13,6 +13,9 @@
 #include
 #include

+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 337 * 1024;
+
 asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
                         u8 hash[NH_HASH_BYTES]);

@@ -26,18 +29,20 @@ static void _nh_avx2(const u32 *key, const u8 *message, size_t message_len,
 static int nhpoly1305_avx2_update(struct shash_desc *desc,
                                   const u8 *src, unsigned int srclen)
 {
+        BUILD_BUG_ON(bytes_per_fpu == 0);
+
         if (srclen < 64 || !crypto_simd_usable())
                 return crypto_nhpoly1305_update(desc, src, srclen);

-        do {
-                unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+        while (srclen) {
+                unsigned int n = min(srclen, bytes_per_fpu);

                 kernel_fpu_begin();
                 crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
                 kernel_fpu_end();
                 src += n;
                 srclen -= n;
-        } while (srclen);
+        }
         return 0;
 }

diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index 2b353d42ed13..daffcc7019ad 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -13,6 +13,9 @@
 #include
 #include

+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 199 * 1024;
+
 asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
                         u8 hash[NH_HASH_BYTES]);

@@ -26,18 +29,20 @@ static void _nh_sse2(const u32 *key, const u8 *message, size_t message_len,
 static int nhpoly1305_sse2_update(struct shash_desc *desc,
                                   const u8 *src, unsigned int srclen)
 {
+        BUILD_BUG_ON(bytes_per_fpu == 0);
+
         if (srclen < 64 || !crypto_simd_usable())
                 return crypto_nhpoly1305_update(desc, src, srclen);

-        do {
-                unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+        while (srclen) {
+                unsigned int n = min(srclen, bytes_per_fpu);

                 kernel_fpu_begin();
                 crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2);
                 kernel_fpu_end();
                 src += n;
                 srclen -= n;
-        } while (srclen);
+        }
         return 0;
 }

diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 1dfb8af48a3c..16831c036d71 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -15,20 +15,27 @@
 #include
 #include

+#define POLY1305_BLOCK_SIZE_MASK (~(POLY1305_BLOCK_SIZE - 1))
+
+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 217 * 1024;
+
 asmlinkage void poly1305_init_x86_64(void *ctx,
                                      const u8 key[POLY1305_BLOCK_SIZE]);
 asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp,
-                                       const size_t len, const u32 padbit);
+                                       const unsigned int len,
+                                       const u32 padbit);
 asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
                                      const u32 nonce[4]);
 asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
                                   const u32 nonce[4]);
-asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len,
-                                    const u32 padbit);
-asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len,
-                                     const u32 padbit);
+asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp,
+                                    const unsigned int len, const u32 padbit);
+asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp,
+                                     const unsigned int len, const u32 padbit);
 asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp,
-                                       const size_t len, const u32 padbit);
+                                       const unsigned int len,
+                                       const u32 padbit);

 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx);
 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2);
@@ -86,14 +93,12 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE])
                 poly1305_init_x86_64(ctx, key);
 }

-static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
+static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len,
                                  const u32 padbit)
 {
         struct poly1305_arch_internal *state = ctx;

-        /* SIMD disables preemption, so relax after processing each page. */
-        BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
-                     SZ_4K % POLY1305_BLOCK_SIZE);
+        BUILD_BUG_ON(bytes_per_fpu < POLY1305_BLOCK_SIZE);

         if (!static_branch_likely(&poly1305_use_avx) ||
             (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) ||
@@ -103,8 +108,14 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
                 return;
         }

-        do {
-                const size_t bytes = min_t(size_t, len, SZ_4K);
+        while (len) {
+                unsigned int bytes;
+
+                if (len < POLY1305_BLOCK_SIZE)
+                        bytes = len;
+                else
+                        bytes = min(len,
+                                    bytes_per_fpu & POLY1305_BLOCK_SIZE_MASK);

                 kernel_fpu_begin();
                 if (IS_ENABLED(CONFIG_AS_AVX512) &&
                     static_branch_likely(&poly1305_use_avx512))
@@ -117,7 +128,7 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,

                 len -= bytes;
                 inp += bytes;
-        } while (len);
+        }
 }

 static void poly1305_simd_emit(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c
index b7664d018851..de1c908f7412 100644
--- a/arch/x86/crypto/polyval-clmulni_glue.c
+++ b/arch/x86/crypto/polyval-clmulni_glue.c
@@ -29,6 +29,9 @@

 #define NUM_KEY_POWERS  8

+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 393 * 1024;
+
 struct polyval_tfm_ctx {
         /*
          * These powers must be in the order h^8, ..., h^1.
@@ -107,6 +110,8 @@ static int polyval_x86_update(struct shash_desc *desc,
         unsigned int nblocks;
         unsigned int n;

+        BUILD_BUG_ON(bytes_per_fpu < POLYVAL_BLOCK_SIZE);
+
         if (dctx->bytes) {
                 n = min(srclen, dctx->bytes);
                 pos = dctx->buffer + POLYVAL_BLOCK_SIZE - dctx->bytes;
@@ -123,8 +128,7 @@ static int polyval_x86_update(struct shash_desc *desc,
         }

         while (srclen >= POLYVAL_BLOCK_SIZE) {
-                /* Allow rescheduling every 4K bytes. */
-                nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
+                nblocks = min(srclen, bytes_per_fpu) / POLYVAL_BLOCK_SIZE;
                 internal_polyval_update(tctx, src, nblocks, dctx->buffer);
                 srclen -= nblocks * POLYVAL_BLOCK_SIZE;
                 src += nblocks * POLYVAL_BLOCK_SIZE;
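The type rationale can be seen in isolation. A small sketch (not
driver code): min() insists that both operands have the same type,
which a size_t length or the signed SZ_4K macro would violate:

        unsigned int srclen = 8192;                  /* hypothetical */
        const unsigned int bytes_per_fpu = 217 * 1024;

        /* OK: both operands are unsigned int, no cast needed */
        unsigned int n = min(srclen, bytes_per_fpu);

        /* By contrast, min(srclen, SZ_4K) mixes unsigned int with a
         * signed int macro and would need
         * min_t(unsigned int, srclen, SZ_4K) instead. */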
From patchwork Wed Nov 16 04:13:29 2022
From: Robert Elliott
Subject: [PATCH v4 11/24] crypto: x86/aegis - limit FPU preemption
Date: Tue, 15 Nov 2022 22:13:29 -0600
Message-Id: <20221116041342.3841-12-elliott@hpe.com>

Make kernel_fpu_begin() and kernel_fpu_end() calls around each assembly
language function that uses FPU context, rather than around the entire
set (init, ad, crypt, final).

Limit the processing of bulk data based on a module parameter, so
multiple blocks are processed within one FPU context (associated data
is not limited).

Allow the skcipher_walk functions to sleep again, since they are no
longer called inside FPU context.

Motivation: calling crypto_aead_encrypt() with a single scatter-gather
list entry pointing to a 1 MiB plaintext buffer caused the
aesni_encrypt function to receive a length of 1048576 bytes and consume
306348 cycles within FPU context to process that data.

Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations")
Fixes: ba6771c0a0bc ("crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEP")
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/aegis128-aesni-glue.c | 39 ++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c
index 4623189000d8..6e96bdda2811 100644
--- a/arch/x86/crypto/aegis128-aesni-glue.c
+++ b/arch/x86/crypto/aegis128-aesni-glue.c
@@ -23,6 +23,9 @@
 #define AEGIS128_MIN_AUTH_SIZE 8
 #define AEGIS128_MAX_AUTH_SIZE 16

+/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+static const unsigned int bytes_per_fpu = 4 * 1024;
+
 asmlinkage void crypto_aegis128_aesni_init(void *state, void *key, void *iv);

 asmlinkage void crypto_aegis128_aesni_ad(
@@ -85,15 +88,19 @@ static void crypto_aegis128_aesni_process_ad(
                 if (pos > 0) {
                         unsigned int fill = AEGIS128_BLOCK_SIZE - pos;
                         memcpy(buf.bytes + pos, src, fill);
-                        crypto_aegis128_aesni_ad(state,
+                        kernel_fpu_begin();
+                        crypto_aegis128_aesni_ad(state->blocks,
                                                  AEGIS128_BLOCK_SIZE,
                                                  buf.bytes);
+                        kernel_fpu_end();
                         pos = 0;
                         left -= fill;
                         src += fill;
                 }

-                crypto_aegis128_aesni_ad(state, left, src);
+                kernel_fpu_begin();
+                crypto_aegis128_aesni_ad(state->blocks, left, src);
+                kernel_fpu_end();

                 src += left & ~(AEGIS128_BLOCK_SIZE - 1);
                 left &= AEGIS128_BLOCK_SIZE - 1;
@@ -110,7 +117,9 @@ static void crypto_aegis128_aesni_process_ad(
         if (pos > 0) {
                 memset(buf.bytes + pos, 0, AEGIS128_BLOCK_SIZE - pos);
-                crypto_aegis128_aesni_ad(state, AEGIS128_BLOCK_SIZE, buf.bytes);
+                kernel_fpu_begin();
+                crypto_aegis128_aesni_ad(state->blocks, AEGIS128_BLOCK_SIZE, buf.bytes);
+                kernel_fpu_end();
         }
 }

@@ -119,15 +128,23 @@ static void crypto_aegis128_aesni_process_crypt(
         const struct aegis_crypt_ops *ops)
 {
         while (walk->nbytes >= AEGIS128_BLOCK_SIZE) {
-                ops->crypt_blocks(state,
-                                  round_down(walk->nbytes, AEGIS128_BLOCK_SIZE),
+                unsigned int chunk = min(walk->nbytes, bytes_per_fpu);
+
+                chunk = round_down(chunk, AEGIS128_BLOCK_SIZE);
+
+                kernel_fpu_begin();
+                ops->crypt_blocks(state->blocks, chunk,
                                   walk->src.virt.addr, walk->dst.virt.addr);
-                skcipher_walk_done(walk, walk->nbytes % AEGIS128_BLOCK_SIZE);
+                kernel_fpu_end();
+
+                skcipher_walk_done(walk, walk->nbytes - chunk);
         }

         if (walk->nbytes) {
-                ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr,
+                kernel_fpu_begin();
+                ops->crypt_tail(state->blocks, walk->nbytes, walk->src.virt.addr,
                                 walk->dst.virt.addr);
+                kernel_fpu_end();
                 skcipher_walk_done(walk, 0);
         }
 }
@@ -172,15 +189,17 @@ static void crypto_aegis128_aesni_crypt(struct aead_request *req,
         struct skcipher_walk walk;
         struct aegis_state state;

-        ops->skcipher_walk_init(&walk, req, true);
+        ops->skcipher_walk_init(&walk, req, false);

         kernel_fpu_begin();
+        crypto_aegis128_aesni_init(&state.blocks, ctx->key.bytes, req->iv);
+        kernel_fpu_end();

-        crypto_aegis128_aesni_init(&state, ctx->key.bytes, req->iv);
         crypto_aegis128_aesni_process_ad(&state, req->src, req->assoclen);
         crypto_aegis128_aesni_process_crypt(&state, &walk, ops);
-        crypto_aegis128_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
+        kernel_fpu_begin();
+        crypto_aegis128_aesni_final(&state.blocks, tag_xor, req->assoclen, cryptlen);
         kernel_fpu_end();
 }
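In outline, the reworked crypt path looks like this. A simplified
sketch: the real code goes through the ops-> callbacks and handles the
sub-block tail separately, and crypto_aegis128_aesni_enc() stands in
here for ops->crypt_blocks:

        /* atomic=false: the walk may sleep, because no FPU section is
         * held across skcipher_walk_done() any more */
        skcipher_walk_aead_encrypt(&walk, req, false);

        while (walk.nbytes >= AEGIS128_BLOCK_SIZE) {
                unsigned int chunk = min(walk.nbytes, bytes_per_fpu);

                chunk = round_down(chunk, AEGIS128_BLOCK_SIZE);

                kernel_fpu_begin();
                crypto_aegis128_aesni_enc(state, chunk,
                                          walk.src.virt.addr,
                                          walk.dst.virt.addr);
                kernel_fpu_end();

                skcipher_walk_done(&walk, walk.nbytes - chunk); /* may sleep */
        }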
From patchwork Wed Nov 16 04:13:30 2022
From: Robert Elliott
Cc: kernel test robot
Subject: [PATCH v4 12/24] crypto: x86/sha - register all variations
Date: Tue, 15 Nov 2022 22:13:30 -0600
Message-Id: <20221116041342.3841-13-elliott@hpe.com>

Don't register and unregister each of the functions from least- to
most-optimized (e.g., SSSE3, then AVX, then AVX2); register all
variations.

This enables selecting those other algorithms if needed, such as for
testing with:
    modprobe tcrypt mode=300 alg=sha512-avx
    modprobe tcrypt mode=400 alg=sha512-avx

Suggested-by: Tim Chen
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
v3 register all the variations, not just the best one, per Herbert's
   feedback; return -ENODEV if none are successful, 0 if any are
   successful
v4 remove driver_name strings that are only used by later patches (no
   longer included in this series) that enhance the prints. A future
   patch series might remove existing prints rather than add and
   enhance them.
Reported-by: kernel test robot
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 132 +++++++++++++--------------
 arch/x86/crypto/sha256_ssse3_glue.c | 136 +++++++++++++---------------
 arch/x86/crypto/sha512_ssse3_glue.c |  99 +++++++++-----------
 3 files changed, 168 insertions(+), 199 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 4bc77c84b0fb..e75a1060bb5f 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -34,6 +34,13 @@ static const unsigned int bytes_per_fpu_avx2 = 34 * 1024;
 static const unsigned int bytes_per_fpu_avx = 30 * 1024;
 static const unsigned int bytes_per_fpu_ssse3 = 26 * 1024;

+static int using_x86_ssse3;
+static int using_x86_avx;
+static int using_x86_avx2;
+#ifdef CONFIG_AS_SHA1_NI
+static int using_x86_shani;
+#endif
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
                        unsigned int len, unsigned int bytes_per_fpu,
                        sha1_block_fn *sha1_xform)
@@ -128,17 +135,12 @@ static struct shash_alg sha1_ssse3_alg = {
         }
 };

-static int register_sha1_ssse3(void)
-{
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
-                return crypto_register_shash(&sha1_ssse3_alg);
-        return 0;
-}
-
 static void unregister_sha1_ssse3(void)
 {
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
+        if (using_x86_ssse3) {
                 crypto_unregister_shash(&sha1_ssse3_alg);
+                using_x86_ssse3 = 0;
+        }
 }

 asmlinkage void sha1_transform_avx(struct sha1_state *state,
@@ -179,28 +181,12 @@ static struct shash_alg sha1_avx_alg = {
         }
 };

-static bool avx_usable(void)
-{
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
-                if (boot_cpu_has(X86_FEATURE_AVX))
-                        pr_info("AVX detected but unusable.\n");
-                return false;
-        }
-
-        return true;
-}
-
-static int register_sha1_avx(void)
-{
-        if (avx_usable())
-                return crypto_register_shash(&sha1_avx_alg);
-        return 0;
-}
-
 static void unregister_sha1_avx(void)
 {
-        if (avx_usable())
+        if (using_x86_avx) {
                 crypto_unregister_shash(&sha1_avx_alg);
+                using_x86_avx = 0;
+        }
 }

 #define SHA1_AVX2_BLOCK_OPTSIZE 4       /* optimal 4*64 bytes of SHA1 blocks */
@@ -208,16 +194,6 @@ static void unregister_sha1_avx(void)
 asmlinkage void sha1_transform_avx2(struct sha1_state *state,
                                     const u8 *data, int blocks);

-static bool avx2_usable(void)
-{
-        if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2)
-            && boot_cpu_has(X86_FEATURE_BMI1)
-            && boot_cpu_has(X86_FEATURE_BMI2))
-                return true;
-
-        return false;
-}
-
 static void sha1_apply_transform_avx2(struct sha1_state *state,
                                       const u8 *data, int blocks)
 {
@@ -263,17 +239,12 @@ static struct shash_alg sha1_avx2_alg = {
         }
 };

-static int register_sha1_avx2(void)
-{
-        if (avx2_usable())
-                return crypto_register_shash(&sha1_avx2_alg);
-        return 0;
-}
-
 static void unregister_sha1_avx2(void)
 {
-        if (avx2_usable())
+        if (using_x86_avx2) {
                 crypto_unregister_shash(&sha1_avx2_alg);
+                using_x86_avx2 = 0;
+        }
 }

 #ifdef CONFIG_AS_SHA1_NI
@@ -315,49 +286,70 @@ static struct shash_alg sha1_ni_alg = {
         }
 };

-static int register_sha1_ni(void)
-{
-        if (boot_cpu_has(X86_FEATURE_SHA_NI))
-                return crypto_register_shash(&sha1_ni_alg);
-        return 0;
-}
-
 static void unregister_sha1_ni(void)
 {
-        if (boot_cpu_has(X86_FEATURE_SHA_NI))
+        if (using_x86_shani) {
                 crypto_unregister_shash(&sha1_ni_alg);
+                using_x86_shani = 0;
+        }
 }

 #else
-static inline int register_sha1_ni(void) { return 0; }
 static inline void unregister_sha1_ni(void) { }
 #endif

 static int __init sha1_ssse3_mod_init(void)
 {
-        if (register_sha1_ssse3())
-                goto fail;
+        const char *feature_name;
+        int ret;
+
+#ifdef CONFIG_AS_SHA1_NI
+        /* SHA-NI */
+        if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
-        if (register_sha1_avx()) {
-                unregister_sha1_ssse3();
-                goto fail;
+                ret = crypto_register_shash(&sha1_ni_alg);
+                if (!ret)
+                        using_x86_shani = 1;
         }
+#endif
+
+        /* AVX2 */
+        if (boot_cpu_has(X86_FEATURE_AVX2)) {
-        if (register_sha1_avx2()) {
-                unregister_sha1_avx();
-                unregister_sha1_ssse3();
-                goto fail;
+                if (boot_cpu_has(X86_FEATURE_BMI1) &&
+                    boot_cpu_has(X86_FEATURE_BMI2)) {
+
+                        ret = crypto_register_shash(&sha1_avx2_alg);
+                        if (!ret)
+                                using_x86_avx2 = 1;
+                }
         }

-        if (register_sha1_ni()) {
-                unregister_sha1_avx2();
-                unregister_sha1_avx();
-                unregister_sha1_ssse3();
-                goto fail;
+        /* AVX */
+        if (boot_cpu_has(X86_FEATURE_AVX)) {
+
+                if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
+                                      &feature_name)) {
+
+                        ret = crypto_register_shash(&sha1_avx_alg);
+                        if (!ret)
+                                using_x86_avx = 1;
+                }
         }

-        return 0;
-fail:
+        /* SSE3 */
+        if (boot_cpu_has(X86_FEATURE_SSSE3)) {
+                ret = crypto_register_shash(&sha1_ssse3_alg);
+                if (!ret)
+                        using_x86_ssse3 = 1;
+        }
+
+#ifdef CONFIG_AS_SHA1_NI
+        if (using_x86_shani)
+                return 0;
+#endif
+        if (using_x86_avx2 || using_x86_avx || using_x86_ssse3)
+                return 0;
         return -ENODEV;
 }

diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index cdcdf5a80ffe..c6261ede4bae 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -51,6 +51,13 @@ static const unsigned int bytes_per_fpu_ssse3 = 11 * 1024;
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
                                        const u8 *data, int blocks);

+static int using_x86_ssse3;
+static int using_x86_avx;
+static int using_x86_avx2;
+#ifdef CONFIG_AS_SHA256_NI
+static int using_x86_shani;
+#endif
+
 static int _sha256_update(struct shash_desc *desc, const u8 *data,
                           unsigned int len, unsigned int bytes_per_fpu,
                           sha256_block_fn *sha256_xform)
@@ -156,19 +163,13 @@ static struct shash_alg sha256_ssse3_algs[] = { {
         }
 } };

-static int register_sha256_ssse3(void)
-{
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
-                return crypto_register_shashes(sha256_ssse3_algs,
-                                               ARRAY_SIZE(sha256_ssse3_algs));
-        return 0;
-}
-
 static void unregister_sha256_ssse3(void)
 {
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
+        if (using_x86_ssse3) {
                 crypto_unregister_shashes(sha256_ssse3_algs,
                                           ARRAY_SIZE(sha256_ssse3_algs));
+                using_x86_ssse3 = 0;
+        }
 }

 asmlinkage void sha256_transform_avx(struct sha256_state *state,
@@ -223,30 +224,13 @@ static struct shash_alg sha256_avx_algs[] = { {
         }
 } };

-static bool avx_usable(void)
-{
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
-                if (boot_cpu_has(X86_FEATURE_AVX))
-                        pr_info("AVX detected but unusable.\n");
-                return false;
-        }
-
-        return true;
-}
-
-static int register_sha256_avx(void)
-{
-        if (avx_usable())
-                return crypto_register_shashes(sha256_avx_algs,
-                                               ARRAY_SIZE(sha256_avx_algs));
-        return 0;
-}
-
 static void unregister_sha256_avx(void)
 {
-        if (avx_usable())
+        if (using_x86_avx) {
                 crypto_unregister_shashes(sha256_avx_algs,
                                           ARRAY_SIZE(sha256_avx_algs));
+                using_x86_avx = 0;
+        }
 }

 asmlinkage void sha256_transform_rorx(struct sha256_state *state,
@@ -301,28 +285,13 @@ static struct shash_alg sha256_avx2_algs[] = { {
         }
 } };

-static bool avx2_usable(void)
-{
-        if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) &&
-            boot_cpu_has(X86_FEATURE_BMI2))
-                return true;
-
-        return false;
-}
-
-static int register_sha256_avx2(void)
-{
-        if (avx2_usable())
-                return crypto_register_shashes(sha256_avx2_algs,
-                                               ARRAY_SIZE(sha256_avx2_algs));
-        return 0;
-}
-
 static void unregister_sha256_avx2(void)
 {
-        if (avx2_usable())
+        if (using_x86_avx2) {
                 crypto_unregister_shashes(sha256_avx2_algs,
                                           ARRAY_SIZE(sha256_avx2_algs));
+                using_x86_avx2 = 0;
+        }
 }

 #ifdef CONFIG_AS_SHA256_NI
@@ -378,51 +347,72 @@ static struct shash_alg sha256_ni_algs[] = { {
         }
 } };

-static int register_sha256_ni(void)
-{
-        if (boot_cpu_has(X86_FEATURE_SHA_NI))
-                return crypto_register_shashes(sha256_ni_algs,
-                                               ARRAY_SIZE(sha256_ni_algs));
-        return 0;
-}
-
 static void unregister_sha256_ni(void)
 {
-        if (boot_cpu_has(X86_FEATURE_SHA_NI))
+        if (using_x86_shani) {
                 crypto_unregister_shashes(sha256_ni_algs,
                                           ARRAY_SIZE(sha256_ni_algs));
+                using_x86_shani = 0;
+        }
 }

 #else
-static inline int register_sha256_ni(void) { return 0; }
 static inline void unregister_sha256_ni(void) { }
 #endif

 static int __init sha256_ssse3_mod_init(void)
 {
-        if (register_sha256_ssse3())
-                goto fail;
+        const char *feature_name;
+        int ret;
+
+#ifdef CONFIG_AS_SHA256_NI
+        /* SHA-NI */
+        if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
-        if (register_sha256_avx()) {
-                unregister_sha256_ssse3();
-                goto fail;
+                ret = crypto_register_shashes(sha256_ni_algs,
+                                              ARRAY_SIZE(sha256_ni_algs));
+                if (!ret)
+                        using_x86_shani = 1;
         }
+#endif
+
+        /* AVX2 */
+        if (boot_cpu_has(X86_FEATURE_AVX2)) {
-        if (register_sha256_avx2()) {
-                unregister_sha256_avx();
-                unregister_sha256_ssse3();
-                goto fail;
+                if (boot_cpu_has(X86_FEATURE_BMI2)) {
+                        ret = crypto_register_shashes(sha256_avx2_algs,
+                                                      ARRAY_SIZE(sha256_avx2_algs));
+                        if (!ret)
+                                using_x86_avx2 = 1;
+                }
         }

-        if (register_sha256_ni()) {
-                unregister_sha256_avx2();
-                unregister_sha256_avx();
-                unregister_sha256_ssse3();
-                goto fail;
+        /* AVX */
+        if (boot_cpu_has(X86_FEATURE_AVX)) {
+
+                if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
+                                      &feature_name)) {
+                        ret = crypto_register_shashes(sha256_avx_algs,
+                                                      ARRAY_SIZE(sha256_avx_algs));
+                        if (!ret)
+                                using_x86_avx = 1;
+                }
         }

-        return 0;
-fail:
+        /* SSE3 */
+        if (boot_cpu_has(X86_FEATURE_SSSE3)) {
+                ret = crypto_register_shashes(sha256_ssse3_algs,
+                                              ARRAY_SIZE(sha256_ssse3_algs));
+                if (!ret)
+                        using_x86_ssse3 = 1;
+        }
+
+#ifdef CONFIG_AS_SHA256_NI
+        if (using_x86_shani)
+                return 0;
+#endif
+        if (using_x86_avx2 || using_x86_avx || using_x86_ssse3)
+                return 0;
         return -ENODEV;
 }

diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index c7036cfe2a7e..feae85933270 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -47,6 +47,10 @@ static const unsigned int bytes_per_fpu_ssse3 = 17 * 1024;
 asmlinkage void sha512_transform_ssse3(struct sha512_state *state,
                                        const u8 *data, int blocks);

+static int using_x86_ssse3;
+static int using_x86_avx;
+static int using_x86_avx2;
+
 static int sha512_update(struct shash_desc *desc, const u8 *data,
                          unsigned int len, unsigned int bytes_per_fpu,
                          sha512_block_fn *sha512_xform)
@@ -152,33 +156,17 @@ static struct shash_alg sha512_ssse3_algs[] = { {
         }
 } };

-static int register_sha512_ssse3(void)
-{
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
-                return crypto_register_shashes(sha512_ssse3_algs,
-                                               ARRAY_SIZE(sha512_ssse3_algs));
-        return 0;
-}
-
 static void unregister_sha512_ssse3(void)
 {
-        if (boot_cpu_has(X86_FEATURE_SSSE3))
+        if (using_x86_ssse3) {
                 crypto_unregister_shashes(sha512_ssse3_algs,
                                           ARRAY_SIZE(sha512_ssse3_algs));
+                using_x86_ssse3 = 0;
+        }
 }

 asmlinkage void sha512_transform_avx(struct sha512_state *state,
                                      const u8 *data, int blocks);

-static bool avx_usable(void)
-{
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
-                if (boot_cpu_has(X86_FEATURE_AVX))
-                        pr_info("AVX detected but unusable.\n");
-                return false;
-        }
-
-        return true;
-}

 static int sha512_avx_update(struct shash_desc *desc, const u8 *data,
                              unsigned int len)
@@ -230,19 +218,13 @@ static struct shash_alg sha512_avx_algs[] = { {
         }
 } };

-static int register_sha512_avx(void)
-{
-        if (avx_usable())
-                return crypto_register_shashes(sha512_avx_algs,
-                                               ARRAY_SIZE(sha512_avx_algs));
-        return 0;
-}
-
 static void unregister_sha512_avx(void)
 {
-        if (avx_usable())
+        if (using_x86_avx) {
                 crypto_unregister_shashes(sha512_avx_algs,
                                           ARRAY_SIZE(sha512_avx_algs));
+                using_x86_avx = 0;
+        }
 }

 asmlinkage void sha512_transform_rorx(struct sha512_state *state,
@@ -298,22 +280,6 @@ static struct shash_alg sha512_avx2_algs[] = { {
         }
 } };

-static bool avx2_usable(void)
-{
-        if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) &&
-            boot_cpu_has(X86_FEATURE_BMI2))
-                return true;
-
-        return false;
-}
-
-static int register_sha512_avx2(void)
-{
-        if (avx2_usable())
-                return crypto_register_shashes(sha512_avx2_algs,
-                                               ARRAY_SIZE(sha512_avx2_algs));
-        return 0;
-}
 static const struct x86_cpu_id module_cpu_ids[] = {
         X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
         X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
@@ -324,32 +290,53 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static void unregister_sha512_avx2(void)
 {
-        if (avx2_usable())
+        if (using_x86_avx2) {
                 crypto_unregister_shashes(sha512_avx2_algs,
                                           ARRAY_SIZE(sha512_avx2_algs));
+                using_x86_avx2 = 0;
+        }
 }

 static int __init sha512_ssse3_mod_init(void)
 {
+        const char *feature_name;
+        int ret;
+
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;

-        if (register_sha512_ssse3())
-                goto fail;
+        /* AVX2 */
+        if (boot_cpu_has(X86_FEATURE_AVX2)) {
+                if (boot_cpu_has(X86_FEATURE_BMI2)) {
+                        ret = crypto_register_shashes(sha512_avx2_algs,
+                                                      ARRAY_SIZE(sha512_avx2_algs));
+                        if (!ret)
+                                using_x86_avx2 = 1;
+                }
+        }
+
+        /* AVX */
+        if (boot_cpu_has(X86_FEATURE_AVX)) {
-        if (register_sha512_avx()) {
-                unregister_sha512_ssse3();
-                goto fail;
+                if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
+                                      &feature_name)) {
+                        ret = crypto_register_shashes(sha512_avx_algs,
+                                                      ARRAY_SIZE(sha512_avx_algs));
+                        if (!ret)
+                                using_x86_avx = 1;
+                }
         }

-        if (register_sha512_avx2()) {
-                unregister_sha512_avx();
-                unregister_sha512_ssse3();
-                goto fail;
+        /* SSE3 */
+        if (boot_cpu_has(X86_FEATURE_SSSE3)) {
+                ret = crypto_register_shashes(sha512_ssse3_algs,
+                                              ARRAY_SIZE(sha512_ssse3_algs));
+                if (!ret)
+                        using_x86_ssse3 = 1;
         }

-        return 0;
-fail:
+        if (using_x86_avx2 || using_x86_avx || using_x86_ssse3)
+                return 0;
         return -ENODEV;
 }
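The registration pattern above can be condensed into a sketch. This is
illustrative only; my_avx2_alg and using_my_alg are hypothetical names
standing in for any one of the variations:

        static int using_my_alg;  /* set only if registration succeeded */

        static int __init my_mod_init(void)
        {
                if (boot_cpu_has(X86_FEATURE_AVX2)) {
                        if (!crypto_register_shash(&my_avx2_alg))
                                using_my_alg = 1;
                }
                /* ... likewise for the other variations the CPU
                 * supports; each sets its own using_* flag ... */
                return using_my_alg ? 0 : -ENODEV;
        }

        static void __exit my_mod_exit(void)
        {
                if (using_my_alg) {
                        crypto_unregister_shash(&my_avx2_alg);
                        using_my_alg = 0;
                }
        }

Tracking what was actually registered in using_* flags means the exit
path no longer has to re-derive CPU feature checks to decide what to
unregister.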
From patchwork Wed Nov 16 04:13:31 2022
From: Robert Elliott
Subject: [PATCH v4 13/24] crypto: x86/sha - minimize time in FPU context
Date: Tue, 15 Nov 2022 22:13:31 -0600
Message-Id: <20221116041342.3841-14-elliott@hpe.com>

Narrow the kernel_fpu_begin()/kernel_fpu_end() sections to wrap just
the assembly functions, not any extra C code around them (which
includes several memcpy() calls).

This reduces unnecessary time in FPU context, in which the scheduler
is prevented from preempting and the RCU subsystem is kept from doing
its work.
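The change reduces each transform hook to a thin wrapper that owns the
FPU section; a minimal sketch of the pattern, using the sha1 names
from the diff below:

        /* The asm routine keeps its prototype; the wrapper owns the
         * FPU section, so callers never hold FPU context across the
         * surrounding C code. */
        asmlinkage void sha1_transform_avx(struct sha1_state *state,
                                           const u8 *data, int blocks);

        static void fpu_sha1_transform_avx(struct sha1_state *state,
                                           const u8 *data, int blocks)
        {
                kernel_fpu_begin();
                sha1_transform_avx(state, data, blocks);
                kernel_fpu_end();
        }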
Example results measuring a boot, in which SHA-512 is used to check all module signatures using finup() calls: Before: calls maxcycles bpf update finup algorithm module ======== ============ ======== ======== ======== =========== ============== 168390 1233188 19456 0 19456 sha512-avx2 sha512_ssse3 After: 182694 1007224 19456 0 19456 sha512-avx2 sha512_ssse3 That means it stayed in FPU context for 226k fewer clocks cycles (which is 102 microseconds on this system, 18% less). Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_ssse3_glue.c | 82 ++++++++++++++++++++--------- arch/x86/crypto/sha256_ssse3_glue.c | 67 ++++++++++++++++++----- arch/x86/crypto/sha512_ssse3_glue.c | 48 ++++++++++++----- 3 files changed, 145 insertions(+), 52 deletions(-) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index e75a1060bb5f..32f3310e19e2 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -34,6 +34,54 @@ static const unsigned int bytes_per_fpu_avx2 = 34 * 1024; static const unsigned int bytes_per_fpu_avx = 30 * 1024; static const unsigned int bytes_per_fpu_ssse3 = 26 * 1024; +asmlinkage void sha1_transform_ssse3(struct sha1_state *state, + const u8 *data, int blocks); + +asmlinkage void sha1_transform_avx(struct sha1_state *state, + const u8 *data, int blocks); + +asmlinkage void sha1_transform_avx2(struct sha1_state *state, + const u8 *data, int blocks); + +#ifdef CONFIG_AS_SHA1_NI +asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data, + int rounds); +#endif + +static void fpu_sha1_transform_ssse3(struct sha1_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha1_transform_ssse3(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha1_transform_avx(struct sha1_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha1_transform_avx(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha1_transform_avx2(struct sha1_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha1_transform_avx2(state, data, blocks); + kernel_fpu_end(); +} + +#ifdef CONFIG_AS_SHA1_NI +static void fpu_sha1_transform_shani(struct sha1_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha1_ni_transform(state, data, blocks); + kernel_fpu_end(); +} +#endif + static int using_x86_ssse3; static int using_x86_avx; static int using_x86_avx2; @@ -60,9 +108,7 @@ static int sha1_update(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha1_base_do_update(desc, data, chunk, sha1_xform); - kernel_fpu_end(); len -= chunk; data += chunk; @@ -81,36 +127,29 @@ static int sha1_finup(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha1_base_do_update(desc, data, chunk, sha1_xform); - kernel_fpu_end(); len -= chunk; data += chunk; } - kernel_fpu_begin(); sha1_base_do_finalize(desc, sha1_xform); - kernel_fpu_end(); return sha1_base_finish(desc, out); } -asmlinkage void sha1_transform_ssse3(struct sha1_state *state, - const u8 *data, int blocks); - static int sha1_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha1_update(desc, data, len, bytes_per_fpu_ssse3, - sha1_transform_ssse3); + fpu_sha1_transform_ssse3); } static int sha1_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha1_finup(desc, data, len, bytes_per_fpu_ssse3, out, - 
sha1_transform_ssse3); + fpu_sha1_transform_ssse3); } /* Add padding and return the message digest. */ @@ -143,21 +182,18 @@ static void unregister_sha1_ssse3(void) } } -asmlinkage void sha1_transform_avx(struct sha1_state *state, - const u8 *data, int blocks); - static int sha1_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha1_update(desc, data, len, bytes_per_fpu_avx, - sha1_transform_avx); + fpu_sha1_transform_avx); } static int sha1_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha1_finup(desc, data, len, bytes_per_fpu_avx, out, - sha1_transform_avx); + fpu_sha1_transform_avx); } static int sha1_avx_final(struct shash_desc *desc, u8 *out) @@ -191,17 +227,14 @@ static void unregister_sha1_avx(void) #define SHA1_AVX2_BLOCK_OPTSIZE 4 /* optimal 4*64 bytes of SHA1 blocks */ -asmlinkage void sha1_transform_avx2(struct sha1_state *state, - const u8 *data, int blocks); - static void sha1_apply_transform_avx2(struct sha1_state *state, const u8 *data, int blocks) { /* Select the optimal transform based on data block size */ if (blocks >= SHA1_AVX2_BLOCK_OPTSIZE) - sha1_transform_avx2(state, data, blocks); + fpu_sha1_transform_avx2(state, data, blocks); else - sha1_transform_avx(state, data, blocks); + fpu_sha1_transform_avx(state, data, blocks); } static int sha1_avx2_update(struct shash_desc *desc, const u8 *data, @@ -248,21 +281,18 @@ static void unregister_sha1_avx2(void) } #ifdef CONFIG_AS_SHA1_NI -asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data, - int rounds); - static int sha1_ni_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha1_update(desc, data, len, bytes_per_fpu_shani, - sha1_ni_transform); + fpu_sha1_transform_shani); } static int sha1_ni_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha1_finup(desc, data, len, bytes_per_fpu_shani, out, - sha1_ni_transform); + fpu_sha1_transform_shani); } static int sha1_ni_final(struct shash_desc *desc, u8 *out) diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index c6261ede4bae..839da1b36273 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -51,6 +51,51 @@ static const unsigned int bytes_per_fpu_ssse3 = 11 * 1024; asmlinkage void sha256_transform_ssse3(struct sha256_state *state, const u8 *data, int blocks); +asmlinkage void sha256_transform_avx(struct sha256_state *state, + const u8 *data, int blocks); + +asmlinkage void sha256_transform_rorx(struct sha256_state *state, + const u8 *data, int blocks); + +#ifdef CONFIG_AS_SHA256_NI +asmlinkage void sha256_ni_transform(struct sha256_state *digest, + const u8 *data, int rounds); +#endif + +static void fpu_sha256_transform_ssse3(struct sha256_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha256_transform_ssse3(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha256_transform_avx(struct sha256_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha256_transform_avx(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha256_transform_avx2(struct sha256_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha256_transform_rorx(state, data, blocks); + kernel_fpu_end(); +} + +#ifdef CONFIG_AS_SHA256_NI +static void fpu_sha256_transform_shani(struct sha256_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha256_ni_transform(state, data, blocks); +
kernel_fpu_end(); +} +#endif + static int using_x86_ssse3; static int using_x86_avx; static int using_x86_avx2; @@ -77,9 +122,7 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha256_base_do_update(desc, data, chunk, sha256_xform); - kernel_fpu_end(); len -= chunk; data += chunk; @@ -98,17 +141,13 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha256_base_do_update(desc, data, chunk, sha256_xform); - kernel_fpu_end(); len -= chunk; data += chunk; } - kernel_fpu_begin(); sha256_base_do_finalize(desc, sha256_xform); - kernel_fpu_end(); return sha256_base_finish(desc, out); } @@ -117,14 +156,14 @@ static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_ssse3, - sha256_transform_ssse3); + fpu_sha256_transform_ssse3); } static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_ssse3, - out, sha256_transform_ssse3); + out, fpu_sha256_transform_ssse3); } /* Add padding and return the message digest. */ @@ -179,14 +218,14 @@ static int sha256_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_avx, - sha256_transform_avx); + fpu_sha256_transform_avx); } static int sha256_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_avx, - out, sha256_transform_avx); + out, fpu_sha256_transform_avx); } static int sha256_avx_final(struct shash_desc *desc, u8 *out) @@ -240,14 +279,14 @@ static int sha256_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_avx2, - sha256_transform_rorx); + fpu_sha256_transform_avx2); } static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_avx2, - out, sha256_transform_rorx); + out, fpu_sha256_transform_avx2); } static int sha256_avx2_final(struct shash_desc *desc, u8 *out) @@ -302,14 +341,14 @@ static int sha256_ni_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_shani, - sha256_ni_transform); + fpu_sha256_transform_shani); } static int sha256_ni_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_shani, - out, sha256_ni_transform); + out, fpu_sha256_transform_shani); } static int sha256_ni_final(struct shash_desc *desc, u8 *out) diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index feae85933270..48586ab40d55 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -47,6 +47,36 @@ static const unsigned int bytes_per_fpu_ssse3 = 17 * 1024; asmlinkage void sha512_transform_ssse3(struct sha512_state *state, const u8 *data, int blocks); +asmlinkage void sha512_transform_avx(struct sha512_state *state, + const u8 *data, int blocks); + +asmlinkage void sha512_transform_rorx(struct sha512_state *state, + const u8 *data, int blocks); + +static void fpu_sha512_transform_ssse3(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + 
sha512_transform_ssse3(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha512_transform_avx(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha512_transform_avx(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha512_transform_avx2(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha512_transform_rorx(state, data, blocks); + kernel_fpu_end(); +} + static int using_x86_ssse3; static int using_x86_avx; static int using_x86_avx2; @@ -70,9 +100,7 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha512_base_do_update(desc, data, chunk, sha512_xform); - kernel_fpu_end(); len -= chunk; data += chunk; @@ -91,17 +119,13 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha512_base_do_update(desc, data, chunk, sha512_xform); - kernel_fpu_end(); len -= chunk; data += chunk; } - kernel_fpu_begin(); sha512_base_do_finalize(desc, sha512_xform); - kernel_fpu_end(); return sha512_base_finish(desc, out); } @@ -110,14 +134,14 @@ static int sha512_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_ssse3, - sha512_transform_ssse3); + fpu_sha512_transform_ssse3); } static int sha512_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_ssse3, - out, sha512_transform_ssse3); + out, fpu_sha512_transform_ssse3); } /* Add padding and return the message digest. */ @@ -172,14 +196,14 @@ static int sha512_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_avx, - sha512_transform_avx); + fpu_sha512_transform_avx); } static int sha512_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_avx, - out, sha512_transform_avx); + out, fpu_sha512_transform_avx); } /* Add padding and return the message digest. */ @@ -234,14 +258,14 @@ static int sha512_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_avx2, - sha512_transform_rorx); + fpu_sha512_transform_avx2); } static int sha512_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_avx2, - out, sha512_transform_rorx); + out, fpu_sha512_transform_avx2); } /* Add padding and return the message digest. 
*/ From patchwork Wed Nov 16 04:13:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625841 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB6B7C43217 for ; Wed, 16 Nov 2022 04:14:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231811AbiKPEOc (ORCPT ); Tue, 15 Nov 2022 23:14:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231868AbiKPEOY (ORCPT ); Tue, 15 Nov 2022 23:14:24 -0500 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58F2F31F84; Tue, 15 Nov 2022 20:14:19 -0800 (PST) Received: from pps.filterd (m0134424.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG3MWSf016081; Wed, 16 Nov 2022 04:14:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=8ESuWlIrale+sXyV4fNctyjVH7uHSmRimCDCqLPwhog=; b=RBF1t4oGDoPGC+0xeLQ1EWG+ZHfJKiSOxXAophg7F+UmWGEzThZ7WftkUbuFNvyeoNr1 vB7AWaFoO+SdHX7c3pQ6vmRi87PPNYxwVzMC62JTidoa/IilwzZCgwuNMMlslroN3POR jzK+SVaksSzCTfgnJ5oMViD8BwHp8NudD9zVuis4r4g7Gy/vzcaWV3Mc4K7p+XhT7znx Jeq0V/djvbuDpSU0AxX9yLKj5oSi0nA3kXJNXcA7K879XrDG0J7Bc2WQ5vaxRJgnMzMO LChW/PmGJWwJv5pz9zJOmuEWxwHPrb+/k9UNCwueJNNMH5mzhdw/1AZRp8AcJbzDroXJ Cg== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kvqwa0ajs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:10 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 908C32EEF6; Wed, 16 Nov 2022 04:14:09 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 1E29880FE88; Wed, 16 Nov 2022 04:14:09 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 14/24] crypto: x86/sha - load based on CPU features Date: Tue, 15 Nov 2022 22:13:32 -0600 Message-Id: <20221116041342.3841-15-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: 54hW-XXOxZ2aQxfGxRBb41DPGlIROLF3 X-Proofpoint-ORIG-GUID: 54hW-XXOxZ2aQxfGxRBb41DPGlIROLF3 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: 
rule=outbound_notspam policy=outbound score=0 mlxscore=0 adultscore=0 phishscore=0 suspectscore=0 clxscore=1015 priorityscore=1501 bulkscore=0 mlxlogscore=999 spamscore=0 lowpriorityscore=0 impostorscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), add module aliases for x86-optimized crypto modules: sha1, sha256 based on CPU feature bits so udev gets a chance to load them later in the boot process when the filesystems are all running. Signed-off-by: Robert Elliott --- v3 put device table SHA_NI entries inside CONFIG_SHAn_NI ifdefs, ensure builds properly with arch/x86/Kconfig.assembler changed to not set CONFIG_AS_SHA*_NI --- arch/x86/crypto/sha1_ssse3_glue.c | 15 +++++++++++++++ arch/x86/crypto/sha256_ssse3_glue.c | 15 +++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index 32f3310e19e2..806463f57b6d 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -24,6 +24,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -328,11 +329,25 @@ static void unregister_sha1_ni(void) static inline void unregister_sha1_ni(void) { } #endif +static const struct x86_cpu_id module_cpu_ids[] = { +#ifdef CONFIG_AS_SHA1_NI + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), +#endif + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sha1_ssse3_mod_init(void) { const char *feature_name; int ret; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + #ifdef CONFIG_AS_SHA1_NI /* SHA-NI */ if (boot_cpu_has(X86_FEATURE_SHA_NI)) { diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 839da1b36273..30c8c50c1123 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -38,6 +38,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -399,11 +400,25 @@ static void unregister_sha256_ni(void) static inline void unregister_sha256_ni(void) { } #endif +static const struct x86_cpu_id module_cpu_ids[] = { +#ifdef CONFIG_AS_SHA256_NI + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), +#endif + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sha256_ssse3_mod_init(void) { const char *feature_name; int ret; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + #ifdef CONFIG_AS_SHA256_NI /* SHA-NI */ if (boot_cpu_has(X86_FEATURE_SHA_NI)) { From patchwork Wed Nov 16 04:13:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625840 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BDD5C43217 for ; Wed, 16 Nov 2022 04:14:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S231357AbiKPEOg (ORCPT ); Tue, 15 Nov 2022 23:14:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54352 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231834AbiKPEOZ (ORCPT ); Tue, 15 Nov 2022 23:14:25 -0500 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1A28E31DEC; Tue, 15 Nov 2022 20:14:21 -0800 (PST) Received: from pps.filterd (m0134425.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG3RlJv014105; Wed, 16 Nov 2022 04:14:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=6SZmW+gWgFMHc7MDMy56h8QMVAgK51dtPijJjTSczbQ=; b=nvNEjlnSt/zEG6vEC+p5CU74HwPGCMynUh3pRF0tdqv09LK8BZDuit0odSNONEkyhBDv 0KPA2FI7Q2XecXCoz28zVLRsxf98gZi5t8UjG6WQ8n6fibRzvilrjz7kXOMUU8F+Rmqt Ak72nShCrcY3HR9lRhEbvpK31KqgGNxVa8cZ+IIpj6ct8fONYVMsYNsurzhYRxEf8E8q +miA/EzZ6iYvghZFxKOkWkhgS5va5IqGdRzYH4Qz/DcqRAjDOaBynU8i0JouQlbo+JjM OWiZMyxWD92HnHqxZwWxg6MZPynGha1T6SHU2BgXrP2P1Lrj4x/vwvgmLgmZ2xHDpxxa hg== Received: from p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kvqyfr9jn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:11 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14881.it.hpe.com (Postfix) with ESMTPS id EFC37809F54; Wed, 16 Nov 2022 04:14:10 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 79FA88058DE; Wed, 16 Nov 2022 04:14:10 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 15/24] crypto: x86/crc - load based on CPU features Date: Tue, 15 Nov 2022 22:13:33 -0600 Message-Id: <20221116041342.3841-16-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: JLbth3PE-M5wIDPE3_GTv_qHBXz7tZW6 X-Proofpoint-ORIG-GUID: JLbth3PE-M5wIDPE3_GTv_qHBXz7tZW6 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 lowpriorityscore=0 phishscore=0 adultscore=0 malwarescore=0 priorityscore=1501 impostorscore=0 spamscore=0 clxscore=1015 mlxscore=0 suspectscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), these x86-optimized crypto modules already have module aliases based on CPU feature bits: crc32, crc32c, and 
crct10dif Rename the unique device table data structure to a generic name so the code has the same pattern in all the modules. Remove the print on a device table mismatch from crc32 that is not present in the other modules. Modules are not supposed to print unless they are active. Signed-off-by: Robert Elliott --- arch/x86/crypto/crc32-pclmul_glue.c | 10 ++++------ arch/x86/crypto/crc32c-intel_glue.c | 6 +++--- arch/x86/crypto/crct10dif-pclmul_glue.c | 6 +++--- 3 files changed, 10 insertions(+), 12 deletions(-) diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index df3dbc754818..d5e889c24bea 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -182,20 +182,18 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crc32pclmul_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crc32pclmul_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32_pclmul_mod_init(void) { - - if (!x86_match_cpu(crc32pclmul_cpu_id)) { - pr_info("PCLMULQDQ-NI instructions are not detected.\n"); + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - } + return crypto_register_shash(&alg); } diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index f08ed68ec93d..aff132e925ea 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -240,15 +240,15 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crc32c_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_XMM4_2, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crc32c_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32c_intel_mod_init(void) { - if (!x86_match_cpu(crc32c_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) { diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 4f6b8c727d88..a26dbd27da96 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -139,15 +139,15 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crct10dif_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crct10dif_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crct10dif_intel_mod_init(void) { - if (!x86_match_cpu(crct10dif_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; return crypto_register_shash(&alg); From patchwork Wed Nov 16 04:13:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625839 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C0CEC433FE for ; Wed, 16 Nov 2022 04:14:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232055AbiKPEOp (ORCPT ); Tue, 15 Nov 2022 23:14:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231912AbiKPEO0 (ORCPT ); Tue, 
15 Nov 2022 23:14:26 -0500 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63C8931FA7; Tue, 15 Nov 2022 20:14:22 -0800 (PST) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG4CaG2022245; Wed, 16 Nov 2022 04:14:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=D3DfJIKsHFauPuLg8l8zt1cOt0x3vTUhVteBJZCL2WE=; b=I3h47jtyqI1/uAdnG6Dz8OOmpP4hcRv8bH3lW9oFuC6iVijpCQUTG+5UNJyBGHxfHcSZ trzNAFvQWAcHVqoO7W/b0Hw3qNCF17MTYuMm/EhEwSND/wwqiqlEB/r3m6exupj63cpZ nebvWBO6rO7VPl8WJ1/+0LsXMTSONzTjG1DPQjiHPZxa34U7MLIiiSiyS+ICtnji2UXY w4L9FNtVa3NvLgh9XsLOgtIOhwfbtA6c2429DmF2hyi9TijCpVPRcxFZPcGFxYOmAiU2 FPwzneU8cvjqRla8oJ5fwEV0nGbgY4K5yBBG5fRSExpx+lddMlHnvthY5FSeeacqcE8n Lg== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kvrmng09e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:13 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 944162EECF; Wed, 16 Nov 2022 04:14:12 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 1F7FD808B9A; Wed, 16 Nov 2022 04:14:12 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 16/24] crypto: x86/sm3 - load based on CPU features Date: Tue, 15 Nov 2022 22:13:34 -0600 Message-Id: <20221116041342.3841-17-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: 0lkMSIZrD6OBxDlNv6JZ2MeDy4pF3oRr X-Proofpoint-ORIG-GUID: 0lkMSIZrD6OBxDlNv6JZ2MeDy4pF3oRr X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 lowpriorityscore=0 priorityscore=1501 mlxscore=0 suspectscore=0 phishscore=0 spamscore=0 adultscore=0 impostorscore=0 malwarescore=0 clxscore=1015 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), add module aliases for x86-optimized crypto modules: sm3 based on CPU feature bits so udev gets a chance to load them later in the boot process when the filesystems are all running. 
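Condensed, the mechanism looks like this (a sketch mirroring the sm3 diff below): the device table entry makes the module visible to udev's CPU-feature matching, while the init function keeps its own -ENODEV gate:

#include <asm/cpu_device_id.h>

static const struct x86_cpu_id module_cpu_ids[] = {
        X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
        {}
};
/* emits a modalias so udev can autoload the module on AVX-capable CPUs */
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

static int __init sm3_avx_mod_init(void)
{
        if (!x86_match_cpu(module_cpu_ids))
                return -ENODEV;

        /* the real init also checks BMI2 and XSAVE state before
         * registering the shash, as the diff below shows */
        return 0;
}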
Signed-off-by: Robert Elliott --- v4 removed second AVX check that is unreachable --- arch/x86/crypto/sm3_avx_glue.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index 483aaed996ba..c7786874319c 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -119,14 +120,18 @@ static struct shash_alg sm3_avx_alg = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sm3_avx_mod_init(void) { const char *feature_name; - if (!boot_cpu_has(X86_FEATURE_AVX)) { - pr_info("AVX instruction are not detected.\n"); + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - } if (!boot_cpu_has(X86_FEATURE_BMI2)) { pr_info("BMI2 instruction are not detected.\n"); From patchwork Wed Nov 16 04:13:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625838 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2E00C4332F for ; Wed, 16 Nov 2022 04:15:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232195AbiKPEO7 (ORCPT ); Tue, 15 Nov 2022 23:14:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229495AbiKPEOc (ORCPT ); Tue, 15 Nov 2022 23:14:32 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 561163205E; Tue, 15 Nov 2022 20:14:25 -0800 (PST) Received: from pps.filterd (m0150241.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG3Zd6V015695; Wed, 16 Nov 2022 04:14:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=cTzqPQFdhTvUVOcsBjcnPT2NWlxFqM19tqppsEYjuYM=; b=gG3oz65q2sA8UpQnUKLlfZu9X3sSK8E6WnisyHzRLbbdryTKn++4uqJg2Q21hi80yHYl 6T5GDkhbneJerWV5h27PFn0c176350ZaY8kI886qTKzqU7/aKdIaVHrtEwarGm+TwF9u zWLXpyvfIUf9cMS70MU6xv49SkIYoL/lFD2JNuc91i+UHeEdTZljrp3rul/cS3SvLLtY ISM1d5ZcxPrGMLXlQgPRocsZPer0gftY1flBACb6mrxoS6m4/U8COWf/XJ05n4pcl8OK 7CN+EmjMQxj6h9MUJHOtl5rY4+1fzxdtKS+C5UJuJlaqxlmB4R+dBRHNmadpVeGtXAVP XQ== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3kvqkbreg5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:14 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 0A42A4B5C9; Wed, 16 Nov 2022 04:14:14 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 
8E539808BA4; Wed, 16 Nov 2022 04:14:13 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 17/24] crypto: x86/poly - load based on CPU features Date: Tue, 15 Nov 2022 22:13:35 -0600 Message-Id: <20221116041342.3841-18-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: SD-tb7NqfoFHqzT8wvROtWvqBdLg81oS X-Proofpoint-ORIG-GUID: SD-tb7NqfoFHqzT8wvROtWvqBdLg81oS X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 impostorscore=0 spamscore=0 clxscore=1015 mlxscore=0 suspectscore=0 phishscore=0 bulkscore=0 lowpriorityscore=0 priorityscore=1501 adultscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), these x86-optimized crypto modules already have module aliases based on CPU feature bits:

    nhpoly1305
    poly1305
    polyval

Rename the unique device table data structure to a generic name so the
code has the same pattern in all the modules.

Remove the __maybe_unused attribute from polyval since it is always used.

Signed-off-by: Robert Elliott
---
v4
  Removed CPU feature checks that are unreachable because the
  x86_match_cpu call already handles them.
  Made poly1305 match on all features since it does provide an x86_64
  asm function if avx, avx2, and avx512f are not available.
  Move polyval into this patch rather than pair with ghash.
  Remove __maybe_unused from polyval.
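The poly1305 hunk is the notable one (a condensed sketch of the change below): matching X86_FEATURE_ANY keeps the module loadable on every x86-64 CPU because a plain assembly path exists, and the SIMD paths are switched on afterward through static branches:

static const struct x86_cpu_id module_cpu_ids[] = {
        X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),  /* matches any x86 CPU */
        {}
};
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

static int __init poly1305_simd_mod_init(void)
{
        if (!x86_match_cpu(module_cpu_ids))
                return -ENODEV;

        if (boot_cpu_has(X86_FEATURE_AVX) &&
            cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL))
                static_branch_enable(&poly1305_use_avx);

        /* the avx2 and avx512 branches follow the same shape in the
         * full hunk, before the shash registration */
        return 0;
}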
--- arch/x86/crypto/nhpoly1305-avx2-glue.c | 13 +++++++++++-- arch/x86/crypto/nhpoly1305-sse2-glue.c | 9 ++++++++- arch/x86/crypto/poly1305_glue.c | 10 ++++++++++ arch/x86/crypto/polyval-clmulni_glue.c | 6 +++--- 4 files changed, 32 insertions(+), 6 deletions(-) diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index f7dc9c563bb5..fa415fec5793 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -60,10 +61,18 @@ static struct shash_alg nhpoly1305_alg = { .descsize = sizeof(struct nhpoly1305_state), }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init nhpoly1305_mod_init(void) { - if (!boot_cpu_has(X86_FEATURE_AVX2) || - !boot_cpu_has(X86_FEATURE_OSXSAVE)) + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_OSXSAVE)) return -ENODEV; return crypto_register_shash(&nhpoly1305_alg); diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index daffcc7019ad..c47765e46236 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -60,9 +61,15 @@ static struct shash_alg nhpoly1305_alg = { .descsize = sizeof(struct nhpoly1305_state), }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init nhpoly1305_mod_init(void) { - if (!boot_cpu_has(X86_FEATURE_XMM2)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; return crypto_register_shash(&nhpoly1305_alg); diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 16831c036d71..f1e39e23b2a3 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -268,8 +269,17 @@ static struct shash_alg alg = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init poly1305_simd_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (boot_cpu_has(X86_FEATURE_AVX) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) static_branch_enable(&poly1305_use_avx); diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index de1c908f7412..b98e32f8e2a4 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -176,15 +176,15 @@ static struct shash_alg polyval_alg = { }, }; -__maybe_unused static const struct x86_cpu_id pcmul_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init polyval_clmulni_mod_init(void) { - if (!x86_match_cpu(pcmul_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX)) From patchwork Wed Nov 16 04:13:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625836 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4017CC433FE for ; Wed, 16 Nov 2022 04:16:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232550AbiKPEQm (ORCPT ); Tue, 15 Nov 2022 23:16:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54352 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232126AbiKPEO4 (ORCPT ); Tue, 15 Nov 2022 23:14:56 -0500 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02E1131EF2; Tue, 15 Nov 2022 20:14:29 -0800 (PST) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG4CbvL022259; Wed, 16 Nov 2022 04:14:17 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=4jOIa0LKBXHh2Ll5Fgpe6Cbe58VzxCQ8HYLa+/r8boU=; b=nxQ1HgbNaQZ0NUphmOGWelg5sN6A15ZiJGA8Jh8h8pyHkU2ENSkM+qi2ZcnfmNhxTIte lawV4R+MdIwMBEPWi3R5sPK3FPS1kZ0p3WunbzrX/wqb52k/sHCjlPKfHbTqoerqgpF/ WQlV2HvR2nd3nD8ELmhSb63BpZT5w3AEHgYcfGmra5ABXlO0D/a3v34zdarI75AhXCH6 P4lwrZl5R1nKcGCX9V/Uo8l5sGZi3YUGyuXRcRQb3LIWos5mjgqKWPtQQXPldmSkifpo FL/bQBBe5XtJESiMWixDjawiyrlBJTepuMNqwpyqJJI0l+vzkgqQ6eA1l18brN5JXSnu EQ== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kvrmng09p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:17 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 4A03D4B5E0; Wed, 16 Nov 2022 04:14:15 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id C1C548058DE; Wed, 16 Nov 2022 04:14:14 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 18/24] crypto: x86/ghash - load based on CPU features Date: Tue, 15 Nov 2022 22:13:36 -0600 Message-Id: <20221116041342.3841-19-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: flpUPOjh0lI0Hb5cDcS50prqvbbal9Qx X-Proofpoint-ORIG-GUID: flpUPOjh0lI0Hb5cDcS50prqvbbal9Qx X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 lowpriorityscore=0 priorityscore=1501 mlxscore=0 suspectscore=0 phishscore=0 spamscore=0 
adultscore=0 impostorscore=0 malwarescore=0 clxscore=1015 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), these x86-optimized crypto modules already have module aliases based on CPU feature bits: ghash Rename the unique device table data structure to a generic name so the code has the same pattern in all the modules. Signed-off-by: Robert Elliott --- v4 move polyval into a separate patch --- arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 0f24c3b23fd2..d19a8e9b34a6 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -325,17 +325,17 @@ static struct ahash_alg ghash_async_alg = { }, }; -static const struct x86_cpu_id pcmul_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), /* Pickle-Mickle-Duck */ {} }; -MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init ghash_pclmulqdqni_mod_init(void) { int err; - if (!x86_match_cpu(pcmul_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; err = crypto_register_shash(&ghash_alg); From patchwork Wed Nov 16 04:13:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 625837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 426F0C43217 for ; Wed, 16 Nov 2022 04:16:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232083AbiKPEQd (ORCPT ); Tue, 15 Nov 2022 23:16:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231821AbiKPEOy (ORCPT ); Tue, 15 Nov 2022 23:14:54 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BAF7932075; Tue, 15 Nov 2022 20:14:27 -0800 (PST) Received: from pps.filterd (m0134422.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AG405Sx022067; Wed, 16 Nov 2022 04:14:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=9Vrn3Lcfdc63P//BKRWW0n5WCD4BkxzoQc4vkqFvrJ4=; b=g12To9Hgp8+ki2cPD7TZU/b3mY4ZW1tdICF781eb0lhMK+fvt/53XROzdTgmP4fP5cpQ mGtSpmERKASB9Fx9t4AoZMmZQNMsETkLrZuihBLbQEkv6Igp5XMegbCvszOCaISfjyQA HZ479s0hJhJ4Sb3ouriHsqX3JYXMMxuW8UwLehGnoCc/prgPLxCT79ZlL9f1qsMyH1cO CnEd2ld8EuS6xDMLRAtfc8yORkKHmSaMIRY93WcyWMl7sZPs1pPh4K2QIN02gGz+WSWD 26vncZIURuEMWekJoncGtrAW7thheacC+QEhoPHNck+Y3xz2FocTarB2ZtupoLCipW1O tg== Received: from p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kvrew82xf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 04:14:18 +0000 Received: from 
p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14881.it.hpe.com (Postfix) with ESMTPS id CC405809F55; Wed, 16 Nov 2022 04:14:17 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 557EA80FE88; Wed, 16 Nov 2022 04:14:17 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v4 20/24] crypto: x86/ciphers - load based on CPU features Date: Tue, 15 Nov 2022 22:13:38 -0600 Message-Id: <20221116041342.3841-21-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221116041342.3841-1-elliott@hpe.com> References: <20221103042740.6556-1-elliott@hpe.com> <20221116041342.3841-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: veUzpVFdiJqBLXq4d6paoFfIICqaJzCv X-Proofpoint-GUID: veUzpVFdiJqBLXq4d6paoFfIICqaJzCv X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-15_08,2022-11-15_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 mlxscore=0 impostorscore=0 adultscore=0 lowpriorityscore=0 clxscore=1015 priorityscore=1501 malwarescore=0 phishscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160029 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), add module aliases based on CPU feature bits for modules not implementing hash algorithms:

    aegis, aesni, aria
    blake2s, blowfish
    camellia, cast5, cast6, chacha, curve25519
    des3_ede
    serpent, sm4
    twofish

Signed-off-by: Robert Elliott
---
v4
  Remove CPU feature checks that are unreachable because x86_match_cpu
  already handles them.
  Make curve25519 match on ADX and check BMI2.
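The curve25519 change called out above reduces to the following (taken from the hunk below): ADX drives the modalias for autoloading, and BMI2, which the assembly also needs but which has no entry in the match table, is re-checked at init time:

static const struct x86_cpu_id module_cpu_ids[] = {
        X86_MATCH_FEATURE(X86_FEATURE_ADX, NULL),
        {}
};
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

static int __init curve25519_mod_init(void)
{
        if (!x86_match_cpu(module_cpu_ids))     /* ADX, via udev autoload */
                return -ENODEV;

        if (!boot_cpu_has(X86_FEATURE_BMI2))    /* asm requires BMI2 too */
                return -ENODEV;

        static_branch_enable(&curve25519_use_bmi2_adx);

        return IS_REACHABLE(CONFIG_CRYPTO_KPP) ?
                crypto_register_kpp(&curve25519_alg) : 0;
}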
--- arch/x86/crypto/aegis128-aesni-glue.c | 10 +++++++++- arch/x86/crypto/aesni-intel_glue.c | 6 +++--- arch/x86/crypto/aria_aesni_avx_glue.c | 15 ++++++++++++--- arch/x86/crypto/blake2s-glue.c | 12 +++++++++++- arch/x86/crypto/blowfish_glue.c | 10 ++++++++++ arch/x86/crypto/camellia_aesni_avx2_glue.c | 17 +++++++++++++---- arch/x86/crypto/camellia_aesni_avx_glue.c | 15 ++++++++++++--- arch/x86/crypto/camellia_glue.c | 10 ++++++++++ arch/x86/crypto/cast5_avx_glue.c | 10 ++++++++++ arch/x86/crypto/cast6_avx_glue.c | 10 ++++++++++ arch/x86/crypto/chacha_glue.c | 11 +++++++++-- arch/x86/crypto/curve25519-x86_64.c | 19 ++++++++++++++----- arch/x86/crypto/des3_ede_glue.c | 10 ++++++++++ arch/x86/crypto/serpent_avx2_glue.c | 14 ++++++++++++-- arch/x86/crypto/serpent_avx_glue.c | 10 ++++++++++ arch/x86/crypto/serpent_sse2_glue.c | 11 ++++++++--- arch/x86/crypto/sm4_aesni_avx2_glue.c | 13 +++++++++++-- arch/x86/crypto/sm4_aesni_avx_glue.c | 15 ++++++++++++--- arch/x86/crypto/twofish_avx_glue.c | 10 ++++++++++ arch/x86/crypto/twofish_glue.c | 10 ++++++++++ arch/x86/crypto/twofish_glue_3way.c | 10 ++++++++++ 21 files changed, 216 insertions(+), 32 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 6e96bdda2811..a3ebd018953c 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -282,12 +282,20 @@ static struct aead_alg crypto_aegis128_aesni_alg = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_aead_alg *simd_alg; static int __init crypto_aegis128_aesni_module_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!boot_cpu_has(X86_FEATURE_XMM2) || - !boot_cpu_has(X86_FEATURE_AES) || !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index 921680373855..0505d4f9d2a2 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -1228,17 +1228,17 @@ static struct aead_alg aesni_aeads[0]; static struct simd_aead_alg *aesni_simd_aeads[ARRAY_SIZE(aesni_aeads)]; -static const struct x86_cpu_id aesni_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init aesni_init(void) { int err; - if (!x86_match_cpu(aesni_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_AVX2)) { diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index c561ea4fefa5..6a135203a767 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -5,6 +5,7 @@ * Copyright (c) 2022 Taehee Yoo */ +#include #include #include #include @@ -165,14 +166,22 @@ static struct skcipher_alg aria_algs[] = { static struct simd_skcipher_alg *aria_simd_algs[ARRAY_SIZE(aria_algs)]; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init aria_avx_init(void) { const char *feature_name; - if (!boot_cpu_has(X86_FEATURE_AVX) || - !boot_cpu_has(X86_FEATURE_AES) || + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_AES) || 
!boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("AES or OSXSAVE instructions are not detected.\n"); return -ENODEV; } diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index aaba21230528..df757d18a35a 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -10,7 +10,7 @@ #include #include #include - +#include #include #include #include @@ -55,8 +55,18 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block, } EXPORT_SYMBOL(blake2s_compress); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX512VL, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init blake2s_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (boot_cpu_has(X86_FEATURE_SSSE3)) static_branch_enable(&blake2s_use_ssse3); diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 019c64c1340a..4c0ead71b198 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include /* regular block cipher functions */ asmlinkage void __blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src, @@ -303,10 +304,19 @@ static int force; module_param(force, int, 0); MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist"); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init blowfish_init(void) { int err; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!force && is_blacklisted_cpu()) { printk(KERN_INFO "blowfish-x86_64: performance on this CPU " diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index e7e4d64e9577..6c48fc9f3fde 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "camellia.h" #include "ecb_cbc_helpers.h" @@ -98,17 +99,25 @@ static struct skcipher_alg camellia_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; - if (!boot_cpu_has(X86_FEATURE_AVX) || - !boot_cpu_has(X86_FEATURE_AVX2) || - !boot_cpu_has(X86_FEATURE_AES) || + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_AES) || + !boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 or AES-NI instructions are not detected.\n"); + pr_info("AES-NI, AVX, or OSXSAVE instructions are not detected.\n"); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index c7ccf63e741e..6d7fc96d242e 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "camellia.h" #include "ecb_cbc_helpers.h" @@ -98,16 +99,24 @@ static struct skcipher_alg camellia_algs[] = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + 
static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; - if (!boot_cpu_has(X86_FEATURE_AVX) || - !boot_cpu_has(X86_FEATURE_AES) || + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("AES-NI or OSXSAVE instructions are not detected.\n"); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index d45e9c0c42ac..a3df1043ed73 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -8,6 +8,7 @@ * Copyright (C) 2006 NTT (Nippon Telegraph and Telephone Corporation) */ +#include #include #include #include @@ -1377,10 +1378,19 @@ static int force; module_param(force, int, 0); MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist"); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init camellia_init(void) { int err; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!force && is_blacklisted_cpu()) { printk(KERN_INFO "camellia-x86_64: performance on this CPU " diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index 3976a87f92ad..bdc3c763334c 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "ecb_cbc_helpers.h" @@ -93,12 +94,21 @@ static struct skcipher_alg cast5_algs[] = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)]; static int __init cast5_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { pr_info("CPU feature '%s' is not supported.\n", feature_name); diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index 7e2aea372349..addca34b3511 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "ecb_cbc_helpers.h" @@ -93,12 +94,21 @@ static struct skcipher_alg cast6_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)]; static int __init cast6_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { pr_info("CPU feature '%s' is not supported.\n", feature_name); diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 7b3a1cf0984b..546ab0abf30c 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -13,6 +13,7 @@ #include #include #include +#include #include asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src, @@ -276,10 +277,16 @@ static struct skcipher_alg algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, 
module_cpu_ids); + static int __init chacha_simd_mod_init(void) { - if (!boot_cpu_has(X86_FEATURE_SSSE3)) - return 0; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; static_branch_enable(&chacha_use_simd); diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index d55fa9e9b9e6..ae7536b17bf9 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -12,7 +12,7 @@ #include #include #include - +#include #include #include @@ -1697,13 +1697,22 @@ static struct kpp_alg curve25519_alg = { .max_size = curve25519_max_size, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ADX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init curve25519_mod_init(void) { - if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) - static_branch_enable(&curve25519_use_bmi2_adx); - else - return 0; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_BMI2)) + return -ENODEV; + + static_branch_enable(&curve25519_use_bmi2_adx); + return IS_REACHABLE(CONFIG_CRYPTO_KPP) ? crypto_register_kpp(&curve25519_alg) : 0; } diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index abb8b1fe123b..168cac5c6ca6 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include struct des3_ede_x86_ctx { struct des3_ede_ctx enc; @@ -354,10 +355,19 @@ static int force; module_param(force, int, 0); MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist"); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init des3_ede_x86_init(void) { int err; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!force && is_blacklisted_cpu()) { pr_info("des3_ede-x86_64: performance on this CPU would be suboptimal: disabling des3_ede-x86_64.\n"); return -ENODEV; diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index 347e97f4b713..bc18149fb928 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -12,6 +12,7 @@ #include #include #include +#include #include "serpent-avx.h" #include "ecb_cbc_helpers.h" @@ -94,14 +95,23 @@ static struct skcipher_alg serpent_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_avx2_init(void) { const char *feature_name; - if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 instructions are not detected.\n"); + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_OSXSAVE)) { + pr_info("OSXSAVE instructions are not detected.\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 6c248e1ea4ef..0db18d99da50 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "serpent-avx.h" #include "ecb_cbc_helpers.h" @@ -100,12 +101,21 @@ static struct skcipher_alg serpent_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] 
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 6c248e1ea4ef..0db18d99da50 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "serpent-avx.h"
 #include "ecb_cbc_helpers.h"
@@ -100,12 +101,21 @@ static struct skcipher_alg serpent_algs[] = {
         },
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_init(void)
 {
         const char *feature_name;
 
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
         if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
                                &feature_name)) {
                 pr_info("CPU feature '%s' is not supported.\n", feature_name);
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index d78f37e9b2cf..74f0c89f55ef 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "serpent-sse2.h"
 #include "ecb_cbc_helpers.h"
@@ -103,14 +104,18 @@ static struct skcipher_alg serpent_algs[] = {
         },
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_sse2_init(void)
 {
-        if (!boot_cpu_has(X86_FEATURE_XMM2)) {
-                printk(KERN_INFO "SSE2 instructions are not detected.\n");
+        if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
-        }
 
         return simd_register_skciphers_compat(serpent_algs,
                                               ARRAY_SIZE(serpent_algs),
diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c
index 84bc718f49a3..125b00db89b1 100644
--- a/arch/x86/crypto/sm4_aesni_avx2_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -126,6 +127,12 @@ static struct skcipher_alg sm4_aesni_avx2_skciphers[] = {
         }
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)];
 
@@ -133,11 +140,13 @@ static int __init sm4_init(void)
 {
         const char *feature_name;
 
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
         if (!boot_cpu_has(X86_FEATURE_AVX) ||
-            !boot_cpu_has(X86_FEATURE_AVX2) ||
             !boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AVX2 or AES-NI instructions are not detected.\n");
+                pr_info("AVX, AES-NI, and/or OSXSAVE instructions are not detected.\n");
                 return -ENODEV;
         }
diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c
index 7800f77d68ad..ac8182b197cf 100644
--- a/arch/x86/crypto/sm4_aesni_avx_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -445,6 +446,12 @@ static struct skcipher_alg sm4_aesni_avx_skciphers[] = {
         }
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)];
 
@@ -452,10 +459,12 @@ static int __init sm4_init(void)
 {
         const char *feature_name;
 
-        if (!boot_cpu_has(X86_FEATURE_AVX) ||
-            !boot_cpu_has(X86_FEATURE_AES) ||
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
+        if (!boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AVX or AES-NI instructions are not detected.\n");
+                pr_info("AES-NI or OSXSAVE instructions are not detected.\n");
                 return -ENODEV;
         }
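Note that the x86_cpu_id table can key autoloading off only a single
headline feature, which is why the sm4 conversions above keep their
remaining prerequisites as boot_cpu_has() tests in the init function.
Sketched with the same illustrative mycipher names as before:

static int __init mycipher_init(void)
{
        /* Primary feature, mirrored in the match table for autoload. */
        if (!x86_match_cpu(mycipher_cpu_ids))
                return -ENODEV;

        /* Secondary features that do not fit in the one-entry table. */
        if (!boot_cpu_has(X86_FEATURE_AES) ||
            !boot_cpu_has(X86_FEATURE_OSXSAVE))
                return -ENODEV;

        return 0;
}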
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 3eb3440b477a..4657e6efc35d 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -103,12 +104,21 @@ static struct skcipher_alg twofish_algs[] = {
         },
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)];
 
 static int __init twofish_init(void)
 {
         const char *feature_name;
 
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
         if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) {
                 pr_info("CPU feature '%s' is not supported.\n", feature_name);
                 return -ENODEV;
diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c
index f9c4adc27404..ade98aef3402 100644
--- a/arch/x86/crypto/twofish_glue.c
+++ b/arch/x86/crypto/twofish_glue.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 asmlinkage void twofish_enc_blk(struct twofish_ctx *ctx, u8 *dst,
                                 const u8 *src);
@@ -81,8 +82,17 @@ static struct crypto_alg alg = {
         }
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_glue_init(void)
 {
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
         return crypto_register_alg(&alg);
 }
diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c
index 90454cf18e0d..790e5a59a9a7 100644
--- a/arch/x86/crypto/twofish_glue_3way.c
+++ b/arch/x86/crypto/twofish_glue_3way.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -140,8 +141,17 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+        X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+        {}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_3way_init(void)
 {
+        if (!x86_match_cpu(module_cpu_ids))
+                return -ENODEV;
+
         if (!force && is_blacklisted_cpu()) {
                 printk(KERN_INFO "twofish-x86_64-3way: performance on this CPU "
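For modules such as twofish_glue and des3_ede that have no hard ISA
requirement, the patch uses the X86_FEATURE_ANY wildcard, which matches
every x86 CPU; the table exists purely to keep the pattern uniform. A
sketch of just that piece (illustrative names):

/* Matches any x86 CPU; keeps the conversion pattern uniform without
 * restricting where the module can load.
 */
static const struct x86_cpu_id mycipher_cpu_ids[] = {
        X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
        {}
};
MODULE_DEVICE_TABLE(x86cpu, mycipher_cpu_ids);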
From patchwork Wed Nov 16 04:13:40 2022
X-Patchwork-Submitter: "Elliott, Robert \(Servers\)"
X-Patchwork-Id: 625835
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
 Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
 linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v4 22/24] crypto: x86 - report missing CPU features via
 module parameters
Date: Tue, 15 Nov 2022 22:13:40 -0600
Message-Id: <20221116041342.3841-23-elliott@hpe.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221116041342.3841-1-elliott@hpe.com>
References: <20221103042740.6556-1-elliott@hpe.com>
 <20221116041342.3841-1-elliott@hpe.com>
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org

Don't refuse to load modules based on missing additional x86 features
(e.g., OSXSAVE) or x86 XSAVE features (e.g., YMM). Instead, load the
module, but don't register any crypto drivers. Report that one or more
features are missing via a new missing_x86_features module parameter
(0 = no problems, 1 = something is missing; each module parameter
description lists all the features that it wants).

For the SHA functions that register up to four drivers based on CPU
features, report separate module parameters for each set:
    missing_x86_features_avx2
    missing_x86_features_avx

Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/aegis128-aesni-glue.c      | 15 ++++++++++---
 arch/x86/crypto/aria_aesni_avx_glue.c      | 24 +++++++++++---------
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 25 ++++++++++++---------
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 25 ++++++++++++---------
 arch/x86/crypto/cast5_avx_glue.c           | 20 ++++++++++-------
 arch/x86/crypto/cast6_avx_glue.c           | 20 ++++++++++-------
 arch/x86/crypto/curve25519-x86_64.c        | 12 ++++++++--
 arch/x86/crypto/nhpoly1305-avx2-glue.c     | 14 +++++++++---
 arch/x86/crypto/polyval-clmulni_glue.c     | 15 ++++++++++---
 arch/x86/crypto/serpent_avx2_glue.c        | 24 +++++++++++---------
 arch/x86/crypto/serpent_avx_glue.c         | 21 ++++++++++-------
 arch/x86/crypto/sha1_ssse3_glue.c          | 20 +++++++++++++----
 arch/x86/crypto/sha256_ssse3_glue.c        | 18 +++++++++++++--
 arch/x86/crypto/sha512_ssse3_glue.c        | 18 +++++++++++++--
 arch/x86/crypto/sm3_avx_glue.c             | 22 ++++++++++--------
 arch/x86/crypto/sm4_aesni_avx2_glue.c      | 26 +++++++++++++---------
 arch/x86/crypto/sm4_aesni_avx_glue.c       | 26 +++++++++++++---------
 arch/x86/crypto/twofish_avx_glue.c         | 19 ++++++++++------
 18 files changed, 243 insertions(+), 121 deletions(-)
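Every file below applies the same idiom; a condensed self-contained sketch
of it (the mycipher names are illustrative, not the patch itself):

static int missing_x86_features;
module_param(missing_x86_features, int, 0444);  /* read-only via sysfs */
MODULE_PARM_DESC(missing_x86_features,
                 "Missing x86 instruction set extensions (EXAMPLE)");

static int __init mycipher_init(void)
{
        if (!x86_match_cpu(mycipher_cpu_ids))
                return -ENODEV;         /* wrong CPU entirely: don't load */

        if (!boot_cpu_has(X86_FEATURE_OSXSAVE)) {
                missing_x86_features = 1;
                return 0;               /* stay loaded, register nothing */
        }

        return crypto_register_shash(&mycipher_alg);
}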
diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c
index a3ebd018953c..e0312ecf34a8 100644
--- a/arch/x86/crypto/aegis128-aesni-glue.c
+++ b/arch/x86/crypto/aegis128-aesni-glue.c
@@ -288,6 +288,11 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (SSE2) and/or XSAVE features (SSE)");
+
 static struct simd_aead_alg *simd_alg;
 
 static int __init crypto_aegis128_aesni_module_init(void)
@@ -296,8 +301,10 @@ static int __init crypto_aegis128_aesni_module_init(void)
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_XMM2) ||
-            !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
-                return -ENODEV;
+            !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) {
+                missing_x86_features = 1;
+                return 0;
+        }
 
         return simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1,
                                           &simd_alg);
@@ -305,7 +312,9 @@ static int __init crypto_aegis128_aesni_module_init(void)
 
 static void __exit crypto_aegis128_aesni_module_exit(void)
 {
-        simd_unregister_aeads(&crypto_aegis128_aesni_alg, 1, &simd_alg);
+        if (!missing_x86_features)
+                simd_unregister_aeads(&crypto_aegis128_aesni_alg, 1, &simd_alg);
+        missing_x86_features = 0;
 }
 
 module_init(crypto_aegis128_aesni_module_init);
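Because init can now return 0 without registering anything, each exit
handler grows a guard so it never unregisters drivers that were never
registered, as in the aegis128 hunk above. The flag doubles as that
record (illustrative names):

static void __exit mycipher_exit(void)
{
        /* Only undo the registration if init actually performed it. */
        if (!missing_x86_features)
                crypto_unregister_shash(&mycipher_alg);
        missing_x86_features = 0;
}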
diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c
index 9fd3d1fe1105..ebb9760967b5 100644
--- a/arch/x86/crypto/aria_aesni_avx_glue.c
+++ b/arch/x86/crypto/aria_aesni_avx_glue.c
@@ -176,23 +176,25 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (AES-NI, OSXSAVE) and/or XSAVE features (SSE, YMM)");
+
 static int __init aria_avx_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AES or OSXSAVE instructions are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         if (boot_cpu_has(X86_FEATURE_GFNI)) {
@@ -213,8 +215,10 @@ static int __init aria_avx_init(void)
 
 static void __exit aria_avx_exit(void)
 {
-        simd_unregister_skciphers(aria_algs, ARRAY_SIZE(aria_algs),
-                                  aria_simd_algs);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(aria_algs, ARRAY_SIZE(aria_algs),
+                                          aria_simd_algs);
+        missing_x86_features = 0;
         using_x86_gfni = 0;
 }
diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index 6c48fc9f3fde..e8ae1e1a801d 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -105,26 +105,28 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (AES-NI, AVX, OSXSAVE) and/or XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)];
 
 static int __init camellia_aesni_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_AVX) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AES-NI, AVX, or OSXSAVE instructions are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(camellia_algs,
@@ -134,8 +136,11 @@ static int __init camellia_aesni_init(void)
 
 static void __exit camellia_aesni_fini(void)
 {
-        simd_unregister_skciphers(camellia_algs, ARRAY_SIZE(camellia_algs),
-                                  camellia_simd_algs);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(camellia_algs,
+                                          ARRAY_SIZE(camellia_algs),
+                                          camellia_simd_algs);
+        missing_x86_features = 0;
 }
 
 module_init(camellia_aesni_init);
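The second argument to cpu_has_xfeatures() only exists to report which
XSAVE feature was missing, for use in a log message; with the pr_info()
calls dropped, the conversions pass NULL instead. Sketch (illustrative
names):

static int __init mycipher_init(void)
{
        /* NULL: the caller no longer needs the missing feature's name. */
        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
                missing_x86_features = 1;
                return 0;
        }
        return crypto_register_shash(&mycipher_alg);
}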
pr_info("CPU feature '%s' is not supported.\n", feature_name); - return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + missing_x86_features = 1; + return 0; } return simd_register_skciphers_compat(camellia_algs, @@ -133,8 +135,11 @@ static int __init camellia_aesni_init(void) static void __exit camellia_aesni_fini(void) { - simd_unregister_skciphers(camellia_algs, ARRAY_SIZE(camellia_algs), - camellia_simd_algs); + if (!missing_x86_features) + simd_unregister_skciphers(camellia_algs, + ARRAY_SIZE(camellia_algs), + camellia_simd_algs); + missing_x86_features = 0; } module_init(camellia_aesni_init); diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index bdc3c763334c..34ef032bb8d0 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -100,19 +100,21 @@ static const struct x86_cpu_id module_cpu_ids[] = { }; MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); +static int missing_x86_features; +module_param(missing_x86_features, int, 0444); +MODULE_PARM_DESC(missing_x86_features, + "Missing x86 XSAVE features (SSE, YMM)"); + static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)]; static int __init cast5_init(void) { - const char *feature_name; - if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, - &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); - return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + missing_x86_features = 1; + return 0; } return simd_register_skciphers_compat(cast5_algs, @@ -122,8 +124,10 @@ static int __init cast5_init(void) static void __exit cast5_exit(void) { - simd_unregister_skciphers(cast5_algs, ARRAY_SIZE(cast5_algs), - cast5_simd_algs); + if (!missing_x86_features) + simd_unregister_skciphers(cast5_algs, ARRAY_SIZE(cast5_algs), + cast5_simd_algs); + missing_x86_features = 0; } module_init(cast5_init); diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index addca34b3511..71559fd3ea87 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -100,19 +100,21 @@ static const struct x86_cpu_id module_cpu_ids[] = { }; MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); +static int missing_x86_features; +module_param(missing_x86_features, int, 0444); +MODULE_PARM_DESC(missing_x86_features, + "Missing x86 XSAVE features (SSE, YMM)"); + static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)]; static int __init cast6_init(void) { - const char *feature_name; - if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, - &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); - return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + missing_x86_features = 1; + return 0; } return simd_register_skciphers_compat(cast6_algs, @@ -122,8 +124,10 @@ static int __init cast6_init(void) static void __exit cast6_exit(void) { - simd_unregister_skciphers(cast6_algs, ARRAY_SIZE(cast6_algs), - cast6_simd_algs); + if (!missing_x86_features) + simd_unregister_skciphers(cast6_algs, ARRAY_SIZE(cast6_algs), + cast6_simd_algs); + missing_x86_features = 0; } module_init(cast6_init); diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index 6d222849e409..74672351e534 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ 
diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c
index 6d222849e409..74672351e534 100644
--- a/arch/x86/crypto/curve25519-x86_64.c
+++ b/arch/x86/crypto/curve25519-x86_64.c
@@ -1706,13 +1706,20 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (BMI2)");
+
 static int __init curve25519_mod_init(void)
 {
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
-        if (!boot_cpu_has(X86_FEATURE_BMI2))
-                return -ENODEV;
+        if (!boot_cpu_has(X86_FEATURE_BMI2)) {
+                missing_x86_features = 1;
+                return 0;
+        }
 
         static_branch_enable(&curve25519_use_bmi2_adx);
 
@@ -1725,6 +1732,7 @@ static void __exit curve25519_mod_exit(void)
         if (IS_REACHABLE(CONFIG_CRYPTO_KPP) &&
             static_branch_likely(&curve25519_use_bmi2_adx))
                 crypto_unregister_kpp(&curve25519_alg);
+        missing_x86_features = 0;
 }
 
 module_init(curve25519_mod_init);
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index fa415fec5793..2e63947bc9fa 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -67,20 +67,28 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (OSXSAVE)");
+
 static int __init nhpoly1305_mod_init(void)
 {
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
-        if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
-                return -ENODEV;
+        if (!boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+                missing_x86_features = 1;
+                return 0;
+        }
 
         return crypto_register_shash(&nhpoly1305_alg);
 }
 
 static void __exit nhpoly1305_mod_exit(void)
 {
-        crypto_unregister_shash(&nhpoly1305_alg);
+        if (!missing_x86_features)
+                crypto_unregister_shash(&nhpoly1305_alg);
 }
 
 module_init(nhpoly1305_mod_init);
diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c
index b98e32f8e2a4..20d4a68ec1d7 100644
--- a/arch/x86/crypto/polyval-clmulni_glue.c
+++ b/arch/x86/crypto/polyval-clmulni_glue.c
@@ -182,20 +182,29 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (AVX)");
+
 static int __init polyval_clmulni_mod_init(void)
 {
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
-        if (!boot_cpu_has(X86_FEATURE_AVX))
-                return -ENODEV;
+        if (!boot_cpu_has(X86_FEATURE_AVX)) {
+                missing_x86_features = 1;
+                return 0;
+        }
 
         return crypto_register_shash(&polyval_alg);
 }
 
 static void __exit polyval_clmulni_mod_exit(void)
 {
-        crypto_unregister_shash(&polyval_alg);
+        if (!missing_x86_features)
+                crypto_unregister_shash(&polyval_alg);
+        missing_x86_features = 0;
 }
 
 module_init(polyval_clmulni_mod_init);
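curve25519 differs slightly from the skcipher modules: its fast path is
gated by a static branch rather than by driver registration, so a missing
BMI2 simply leaves the key disabled while the generic code path keeps
working. The idiom, sketched with illustrative names:

static DEFINE_STATIC_KEY_FALSE(mycurve_use_bmi2_adx);

static int __init mycurve_init(void)
{
        if (!boot_cpu_has(X86_FEATURE_BMI2)) {
                missing_x86_features = 1;
                return 0;       /* key stays false; generic C path runs */
        }

        static_branch_enable(&mycurve_use_bmi2_adx);
        return 0;
}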
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index bc18149fb928..2aa62c93a16f 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -101,23 +101,25 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (OSXSAVE) and/or XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_avx2_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("OSXSAVE instructions are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(serpent_algs,
@@ -127,8 +129,10 @@ static int __init serpent_avx2_init(void)
 
 static void __exit serpent_avx2_fini(void)
 {
-        simd_unregister_skciphers(serpent_algs, ARRAY_SIZE(serpent_algs),
-                                  serpent_simd_algs);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(serpent_algs, ARRAY_SIZE(serpent_algs),
+                                          serpent_simd_algs);
+        missing_x86_features = 0;
 }
 
 module_init(serpent_avx2_init);
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 0db18d99da50..28ee9717df49 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -107,19 +107,21 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(serpent_algs,
@@ -129,8 +131,11 @@ static int __init serpent_init(void)
 
 static void __exit serpent_exit(void)
 {
-        simd_unregister_skciphers(serpent_algs, ARRAY_SIZE(serpent_algs),
-                                  serpent_simd_algs);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(serpent_algs,
+                                          ARRAY_SIZE(serpent_algs),
+                                          serpent_simd_algs);
+        missing_x86_features = 0;
 }
 
 module_init(serpent_init);
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 2445648cf234..405af5e14b67 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -351,9 +351,17 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features_avx2;
+static int missing_x86_features_avx;
+module_param(missing_x86_features_avx2, int, 0444);
+module_param(missing_x86_features_avx, int, 0444);
+MODULE_PARM_DESC(missing_x86_features_avx2,
+                "Missing x86 instruction set extensions (BMI1, BMI2) to support AVX2");
+MODULE_PARM_DESC(missing_x86_features_avx,
+                "Missing x86 XSAVE features (SSE, YMM) to support AVX");
+
 static int __init sha1_ssse3_mod_init(void)
 {
-        const char *feature_name;
         int ret;
 
         if (!x86_match_cpu(module_cpu_ids))
@@ -374,10 +382,11 @@ static int __init sha1_ssse3_mod_init(void)
 
         if (boot_cpu_has(X86_FEATURE_BMI1) &&
             boot_cpu_has(X86_FEATURE_BMI2)) {
-
                 ret = crypto_register_shash(&sha1_avx2_alg);
                 if (!ret)
                         using_x86_avx2 = 1;
+        } else {
+                missing_x86_features_avx2 = 1;
         }
         }
@@ -385,11 +394,12 @@ static int __init sha1_ssse3_mod_init(void)
 
         if (boot_cpu_has(X86_FEATURE_AVX)) {
                 if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                                      &feature_name)) {
-
+                                      NULL)) {
                         ret = crypto_register_shash(&sha1_avx_alg);
                         if (!ret)
                                 using_x86_avx = 1;
+                } else {
+                        missing_x86_features_avx = 1;
                 }
         }
 
@@ -415,6 +425,8 @@ static void __exit sha1_ssse3_mod_fini(void)
         unregister_sha1_avx2();
         unregister_sha1_avx();
         unregister_sha1_ssse3();
+        missing_x86_features_avx2 = 0;
+        missing_x86_features_avx = 0;
 }
 
 module_init(sha1_ssse3_mod_init);
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 1464e6ccf912..293cf7085dd3 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -413,9 +413,17 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features_avx2;
+static int missing_x86_features_avx;
+module_param(missing_x86_features_avx2, int, 0444);
+module_param(missing_x86_features_avx, int, 0444);
+MODULE_PARM_DESC(missing_x86_features_avx2,
+                "Missing x86 instruction set extensions (BMI2) to support AVX2");
+MODULE_PARM_DESC(missing_x86_features_avx,
+                "Missing x86 XSAVE features (SSE, YMM) to support AVX");
+
 static int __init sha256_ssse3_mod_init(void)
 {
-        const char *feature_name;
         int ret;
 
         if (!x86_match_cpu(module_cpu_ids))
@@ -440,6 +448,8 @@ static int __init sha256_ssse3_mod_init(void)
                                               ARRAY_SIZE(sha256_avx2_algs));
                 if (!ret)
                         using_x86_avx2 = 1;
+        } else {
+                missing_x86_features_avx2 = 1;
         }
         }
@@ -447,11 +457,13 @@ static int __init sha256_ssse3_mod_init(void)
 
         if (boot_cpu_has(X86_FEATURE_AVX)) {
                 if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                                      &feature_name)) {
+                                      NULL)) {
                         ret = crypto_register_shashes(sha256_avx_algs,
                                                       ARRAY_SIZE(sha256_avx_algs));
                         if (!ret)
                                 using_x86_avx = 1;
+                } else {
+                        missing_x86_features_avx = 1;
                 }
         }
 
@@ -478,6 +490,8 @@ static void __exit sha256_ssse3_mod_fini(void)
         unregister_sha256_avx2();
         unregister_sha256_avx();
         unregister_sha256_ssse3();
+        missing_x86_features_avx2 = 0;
+        missing_x86_features_avx = 0;
 }
 
 module_init(sha256_ssse3_mod_init);
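The SHA glue modules register several independent implementations, so a
single flag would be ambiguous; each optional set gets its own read-only
parameter, set in the else branch of the corresponding feature test. A
condensed sketch (myhash and the comments are illustrative):

static int missing_x86_features_avx2;
static int missing_x86_features_avx;
module_param(missing_x86_features_avx2, int, 0444);
module_param(missing_x86_features_avx, int, 0444);

static int __init myhash_init(void)
{
        if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_BMI2)) {
                /* register the AVX2 shash here */
        } else {
                missing_x86_features_avx2 = 1;
        }

        if (boot_cpu_has(X86_FEATURE_AVX) &&
            cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
                /* register the AVX shash here */
        } else {
                missing_x86_features_avx = 1;
        }

        return 0;
}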
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index 04e2af951a3e..9f13baf7dda9 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -319,6 +319,15 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features_avx2;
+static int missing_x86_features_avx;
+module_param(missing_x86_features_avx2, int, 0444);
+module_param(missing_x86_features_avx, int, 0444);
+MODULE_PARM_DESC(missing_x86_features_avx2,
+                "Missing x86 instruction set extensions (BMI2) to support AVX2");
+MODULE_PARM_DESC(missing_x86_features_avx,
+                "Missing x86 XSAVE features (SSE, YMM) to support AVX");
+
 static void unregister_sha512_avx2(void)
 {
         if (using_x86_avx2) {
@@ -330,7 +339,6 @@ static void unregister_sha512_avx2(void)
 
 static int __init sha512_ssse3_mod_init(void)
 {
-        const char *feature_name;
         int ret;
 
         if (!x86_match_cpu(module_cpu_ids))
@@ -343,6 +351,8 @@ static int __init sha512_ssse3_mod_init(void)
                                               ARRAY_SIZE(sha512_avx2_algs));
                 if (!ret)
                         using_x86_avx2 = 1;
+        } else {
+                missing_x86_features_avx2 = 1;
         }
         }
@@ -350,11 +360,13 @@ static int __init sha512_ssse3_mod_init(void)
 
         if (boot_cpu_has(X86_FEATURE_AVX)) {
                 if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                                      &feature_name)) {
+                                      NULL)) {
                         ret = crypto_register_shashes(sha512_avx_algs,
                                                       ARRAY_SIZE(sha512_avx_algs));
                         if (!ret)
                                 using_x86_avx = 1;
+                } else {
+                        missing_x86_features_avx = 1;
                 }
         }
 
@@ -376,6 +388,8 @@ static void __exit sha512_ssse3_mod_fini(void)
         unregister_sha512_avx2();
         unregister_sha512_avx();
         unregister_sha512_ssse3();
+        missing_x86_features_avx2 = 0;
+        missing_x86_features_avx = 0;
 }
 
 module_init(sha512_ssse3_mod_init);
diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index c7786874319c..169ba6a2c806 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -126,22 +126,24 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (BMI2) and/or XSAVE features (SSE, YMM)");
+
 static int __init sm3_avx_mod_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_BMI2)) {
-                pr_info("BMI2 instruction are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return crypto_register_shash(&sm3_avx_alg);
@@ -149,7 +151,9 @@ static int __init sm3_avx_mod_init(void)
 
 static void __exit sm3_avx_mod_exit(void)
 {
-        crypto_unregister_shash(&sm3_avx_alg);
+        if (!missing_x86_features)
+                crypto_unregister_shash(&sm3_avx_alg);
+        missing_x86_features = 0;
 }
 
 module_init(sm3_avx_mod_init);
diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c
index 125b00db89b1..6bcf78231888 100644
--- a/arch/x86/crypto/sm4_aesni_avx2_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c
@@ -133,27 +133,29 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (AES-NI, AVX, OSXSAVE) and/or XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)];
 
 static int __init sm4_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_AVX) ||
             !boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AVX, AES-NI, and/or OSXSAVE instructions are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(sm4_aesni_avx2_skciphers,
@@ -163,9 +165,11 @@ static int __init sm4_init(void)
 
 static void __exit sm4_exit(void)
 {
-        simd_unregister_skciphers(sm4_aesni_avx2_skciphers,
-                                  ARRAY_SIZE(sm4_aesni_avx2_skciphers),
-                                  simd_sm4_aesni_avx2_skciphers);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(sm4_aesni_avx2_skciphers,
+                                          ARRAY_SIZE(sm4_aesni_avx2_skciphers),
+                                          simd_sm4_aesni_avx2_skciphers);
+        missing_x86_features = 0;
 }
 
 module_init(sm4_init);
diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c
index ac8182b197cf..03775b1079dc 100644
--- a/arch/x86/crypto/sm4_aesni_avx_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx_glue.c
@@ -452,26 +452,28 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 instruction set extensions (AES-NI, OSXSAVE) and/or XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)];
 
 static int __init sm4_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
         if (!boot_cpu_has(X86_FEATURE_AES) ||
             !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-                pr_info("AES-NI or OSXSAVE instructions are not detected.\n");
-                return -ENODEV;
+                missing_x86_features = 1;
+                return 0;
         }
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
-                               &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(sm4_aesni_avx_skciphers,
@@ -481,9 +483,11 @@ static int __init sm4_init(void)
 
 static void __exit sm4_exit(void)
 {
-        simd_unregister_skciphers(sm4_aesni_avx_skciphers,
-                                  ARRAY_SIZE(sm4_aesni_avx_skciphers),
-                                  simd_sm4_aesni_avx_skciphers);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(sm4_aesni_avx_skciphers,
+                                          ARRAY_SIZE(sm4_aesni_avx_skciphers),
+                                          simd_sm4_aesni_avx_skciphers);
+        missing_x86_features = 0;
 }
 
 module_init(sm4_init);
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 4657e6efc35d..ae3cc4ad6f4f 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -110,18 +110,21 @@ static const struct x86_cpu_id module_cpu_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
 
+static int missing_x86_features;
+module_param(missing_x86_features, int, 0444);
+MODULE_PARM_DESC(missing_x86_features,
+                "Missing x86 XSAVE features (SSE, YMM)");
+
 static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)];
 
 static int __init twofish_init(void)
 {
-        const char *feature_name;
-
         if (!x86_match_cpu(module_cpu_ids))
                 return -ENODEV;
 
-        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) {
-                pr_info("CPU feature '%s' is not supported.\n", feature_name);
-                return -ENODEV;
+        if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) {
+                missing_x86_features = 1;
+                return 0;
         }
 
         return simd_register_skciphers_compat(twofish_algs,
@@ -131,8 +134,10 @@ static int __init twofish_init(void)
 
 static void __exit twofish_exit(void)
 {
-        simd_unregister_skciphers(twofish_algs, ARRAY_SIZE(twofish_algs),
-                                  twofish_simd_algs);
+        if (!missing_x86_features)
+                simd_unregister_skciphers(twofish_algs, ARRAY_SIZE(twofish_algs),
+                                          twofish_simd_algs);
+        missing_x86_features = 0;
 }
 
 module_init(twofish_init);
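With 0444 permissions, each flag becomes readable from userspace under
/sys/module/<module>/parameters/. A hypothetical userspace C probe; the
module and parameter path here are assumptions based on the patch, not
something it ships:

#include <stdio.h>

int main(void)
{
        /* Path assumed from the patch: sha256-ssse3.ko -> sha256_ssse3. */
        const char *p =
                "/sys/module/sha256_ssse3/parameters/missing_x86_features_avx2";
        FILE *f = fopen(p, "r");
        int missing = -1;

        if (!f) {
                perror(p);
                return 1;
        }
        if (fscanf(f, "%d", &missing) != 1)
                missing = -1;
        fclose(f);

        printf("AVX2 sha256 drivers %s\n",
               missing == 1 ? "skipped: CPU features missing"
                            : "registered (or state unknown)");
        return 0;
}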