From patchwork Wed Jan 26 15:10:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5847BC2BA4C for ; Wed, 26 Jan 2022 15:11:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242588AbiAZPLl (ORCPT ); Wed, 26 Jan 2022 10:11:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40648 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242587AbiAZPLl (ORCPT ); Wed, 26 Jan 2022 10:11:41 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A063C06161C for ; Wed, 26 Jan 2022 07:11:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BF39BB81E3A for ; Wed, 26 Jan 2022 15:11:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 06494C340ED; Wed, 26 Jan 2022 15:11:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209898; bh=+7q3tm71SwRhuirdKp9GKi5rtB4x6+OdJkLWZ94Mc9g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OywQjmwQgTzKX8O6GeH1KqcaMIOCny0KQU/4YUO8gwDpb3//7lmJLvV/8NLJxV/q5 ag9+7/0aK1PPQwQw+Xq5D3OIi0jayb5fFonsoW9TtKo/tHy0PHC84gjC8a5UF/iORH peXlJIRyplOrMvBxXtxhEwAbWmeBidhr8RQ78zp98mqiXIUrNFELnlMNOjVNAsv6LK E41M8/VVfejSHCcSa1MH3OVBUr9CCbMoMMtRooBLAu/P7THDu+abw7NUcGqK+M6SFF 6mRyOWhpu9uU4MDvWd5OtgwH2vvqUcEL1LpnbYyo0OdVvjNM3iSbKd7rmAqg0OnHUu c+LxKL/lmEy3A== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 01/40] arm64: Define CPACR_EL1_FPEN similarly to other floating point controls Date: Wed, 26 Jan 2022 15:10:41 +0000 Message-Id: <20220126151120.3811248-2-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1837; h=from:subject; bh=+7q3tm71SwRhuirdKp9GKi5rtB4x6+OdJkLWZ94Mc9g=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR3xPwduzBtzQGWnI6lrwWCEEEUoE/eJxky1JgR ndA8iA+JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkdwAKCRAk1otyXVSH0LCAB/ 9hezGWJNPKVa612fhSfevbONlcnfaUMTl86av0wM0s6308x6oajAf81Y0krzoEEzYQ6LtwIqlvaT5G TVORg0iXIwCt6P03caZZHkg/yvIkng+f3Q6qh47RfwAwupJ9RHJ/X2q/EzCupSUuGXsys7xMPR626y BVvloXlj2N4k+zV34hFoplm30F2R41jfmi98W/4P+7UhpnUwOTGQhz/annLIEjw0T10PfEYCdCC3Jx TiA3Y87NdWakX1vv8TLwSnA+A/O9pr9Yb5uAaoZ88oIaBiXpDEbKeLaGc82T7pg+Bo9pUhhlVdvxdA Cb+iwT+PxVC2ctv+BMdVomSyoSHIk9 X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The base floating point, SVE and SME all have enable controls for EL0 and EL1 in CPACR_EL1 which have a similar layout and function. Currently the basic floating point enable FPEN is defined differently to the SVE control, specified as a single define in kvm_arm.h rather than in sysreg.h. Move the define to sysreg.h and provide separate EL0 and EL1 control bits so code managing the different floating point enables can look consistent. Signed-off-by: Mark Brown --- arch/arm64/include/asm/kvm_arm.h | 1 - arch/arm64/include/asm/sysreg.h | 4 ++++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 01d47c5886dc..eec790842fe2 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -355,7 +355,6 @@ ECN(SOFTSTP_CUR), ECN(WATCHPT_LOW), ECN(WATCHPT_CUR), \ ECN(BKPT32), ECN(VECTOR32), ECN(BRK64) -#define CPACR_EL1_FPEN (3 << 20) #define CPACR_EL1_TTA (1 << 28) #define CPACR_EL1_DEFAULT (CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 898bee0004ae..1da4c43d597d 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -1097,6 +1097,10 @@ #define ZCR_ELx_LEN_SIZE 9 #define ZCR_ELx_LEN_MASK 0x1ff +#define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */ +#define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */ +#define CPACR_EL1_FPEN (CPACR_EL1_FPEN_EL1EN | CPACR_EL1_FPEN_EL0EN) + #define CPACR_EL1_ZEN_EL1EN (BIT(16)) /* enable EL1 access */ #define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */ #define CPACR_EL1_ZEN (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN) From patchwork Wed Jan 26 15:10:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537028 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DCE9C63684 for ; Wed, 26 Jan 2022 15:11:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242589AbiAZPLp (ORCPT ); Wed, 26 Jan 2022 10:11:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242587AbiAZPLn (ORCPT ); Wed, 26 Jan 2022 10:11:43 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5392DC06161C for ; Wed, 26 Jan 2022 07:11:43 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E53476143D for ; Wed, 26 Jan 2022 15:11:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E49F0C340E3; Wed, 26 Jan 2022 15:11:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209902; bh=nsQjmDdgPJtJfJxvvZpWGXRhAfTHUXOJuIX2lFKVELQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ikYSaVwxCURBPOpZKj8VEyAWuS506/A0dIZ7IjkbGmnUw7s660gEQDz/CUkMtitKz 8Se0JxL2+XE0E+U4JPSiClKxRYwaH1ZLabSM3cHtNBuwbNF0howIBYdGta4keK3rc7 2+wzrS3NixLMbgaA4dh1PunUAmyWgyCKeG4jxstoOpCBn1K87Xk6xycnhGqpHmO3Zb 4I2RrjVXmOsI6aCC0VXckYSkII8KI1aIGYIi11qyBq4hS95npSeP/sGF3qdFHaGpEN uZKB7UyUWvfNktym04GBY7J5fIvERjsba+OfLYUmk5cErs/N6xDPnp4mhxUT/yOa8y ANdpE2Az0QYCw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 02/40] arm64: Always use individual bits in CPACR floating point enables Date: Wed, 26 Jan 2022 15:10:42 +0000 Message-Id: <20220126151120.3811248-3-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3603; h=from:subject; bh=nsQjmDdgPJtJfJxvvZpWGXRhAfTHUXOJuIX2lFKVELQ=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR4eSQHAuaI8AaEmsoOOScY6hIzepTAdNuqKvGd q7qdl8KJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkeAAKCRAk1otyXVSH0J46B/ 4yQBmNBiGZsD7XBLBiPsCet5skV3CgNHGneC/s7vQqN/uvuJfiGRwnfLVKq9f6aixf6l+oMcO49Yh1 S3mnI5B3yahcoChZc5emqOKuwNZJW5EPvAXcak1WkH9zFj+guszDjAo8O+NN7LPU5q23t38nmgqJRU gIh3aal4qlGGiVU3QP5g6oEWwJufXYXLwUk400v5VZajXTyBAvfVNvRN5oF29oX7V48BlqaNfxgp/V 9ShT416+2cynxe1vsCzfxfNmm6VBMrEdk3kMOEB66qNhfACsrxM7yFNI/v4Ykr/AWeEDBL22FzUd5U IpLtCcvwWzlHVXj+tVjppanvEQE2Jw X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org CPACR_EL1 has several bitfields for controlling traps for floating point features to EL1, each of which has a separate bits for EL0 and EL1. Marc Zyngier noted that we are not consistent in our use of defines to manipulate these, sometimes using a define covering the whole field and sometimes using defines for the individual bits. Make this consistent by expanding the whole field defines where they are used (currently only in the KVM code) and deleting them so that no further uses can be introduced. Suggested-by: Marc Zyngier Signed-off-by: Mark Brown --- arch/arm64/include/asm/kvm_arm.h | 3 ++- arch/arm64/include/asm/sysreg.h | 2 -- arch/arm64/kvm/hyp/include/hyp/switch.h | 4 ++-- arch/arm64/kvm/hyp/vhe/switch.c | 6 +++--- 4 files changed, 7 insertions(+), 8 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index eec790842fe2..1767ded83888 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -356,6 +356,7 @@ ECN(BKPT32), ECN(VECTOR32), ECN(BRK64) #define CPACR_EL1_TTA (1 << 28) -#define CPACR_EL1_DEFAULT (CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN) +#define CPACR_EL1_DEFAULT (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |\ + CPACR_EL1_ZEN_EL1EN) #endif /* __ARM64_KVM_ARM_H__ */ diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 1da4c43d597d..e66dd9ebc337 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -1099,11 +1099,9 @@ #define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */ #define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */ -#define CPACR_EL1_FPEN (CPACR_EL1_FPEN_EL1EN | CPACR_EL1_FPEN_EL0EN) #define CPACR_EL1_ZEN_EL1EN (BIT(16)) /* enable EL1 access */ #define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */ -#define CPACR_EL1_ZEN (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN) /* TCR EL1 Bit Definitions */ #define SYS_TCR_EL1_TCMA1 (BIT(58)) diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 58e14f8ead23..4689b0fe7f4d 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -174,9 +174,9 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) /* Valid trap. Switch the context: */ if (has_vhe()) { - reg = CPACR_EL1_FPEN; + reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN; if (sve_guest) - reg |= CPACR_EL1_ZEN; + reg |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; sysreg_clear_set(cpacr_el1, 0, reg); } else { diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 11d053fdd604..619353b06e38 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -38,7 +38,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu) val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; - val &= ~CPACR_EL1_ZEN; + val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN); /* * With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to @@ -53,9 +53,9 @@ static void __activate_traps(struct kvm_vcpu *vcpu) if (update_fp_enabled(vcpu)) { if (vcpu_has_sve(vcpu)) - val |= CPACR_EL1_ZEN; + val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; } else { - val &= ~CPACR_EL1_FPEN; + val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN); __activate_traps_fpsimd32(vcpu); } From patchwork Wed Jan 26 15:10:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537372 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A3C4C28CF5 for ; Wed, 26 Jan 2022 15:11:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242591AbiAZPLs (ORCPT ); Wed, 26 Jan 2022 10:11:48 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:40178 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPLr (ORCPT ); Wed, 26 Jan 2022 10:11:47 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 154106143D for ; Wed, 26 Jan 2022 15:11:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D04ECC340E9; Wed, 26 Jan 2022 15:11:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209906; bh=wmDZvOjgxdLFa7bgm9rdffr3DT9HBunv56+iDfONLzU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QjYQCe4t9jZ8keh1dcVJB130WFPRZgLMEVpZCbV/FU3Uz+cAXwhCE/a0F/9fn2iVi ftKcZ2as1zbenRS3V8WGzrRLX1wyzARV0CjWbbNj0hiP7CZgyulsEcdR1143K2mDRv 8Cg80UI7t4FuyRXBFxbI1Lec+SyJGY2gO3+wpSet25pnJKfDrKrxH7uqgWKDJ5QWCo 3VwQfVIhYHmbJBMezrwmDpllcP8ZItR7JXaKv+5ymaKgTigGjqLvjH1P0Oo7QFeK4g XBpXK3Vg+P3+A2jgTezlYcjABQTYWXcOAvOZDQPo+MQM/wXw3a0d2lncRTZqgvuDmJ kd2kQ4DIVms/g== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 03/40] arm64: cpufeature: Always specify and use a field width for capabilities Date: Wed, 26 Jan 2022 15:10:43 +0000 Message-Id: <20220126151120.3811248-4-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=26719; h=from:subject; bh=wmDZvOjgxdLFa7bgm9rdffr3DT9HBunv56+iDfONLzU=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR400RIvp+uMj/du/Zxg/mQGM4uZ+A7Yu0nJYSE 1Buc6cCJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkeAAKCRAk1otyXVSH0BscB/ 9m8U+f+tSrxeayfb5QqeteICFligkDVjjDzeCNc6OdMZMiWZIu7F+l/Y6W34hct8IR7AWbkzthec/c uia67nFPd5h9C+T4qa+MY17yg5OP6Ylj+F4A56aW1L/PwLBQgLjkBF18jtihQMBhX69r7VZfZnfr/N f4JouCsiHwFrN8ncRd96JUwK8OT/yHyZgYvuzLKaRtIoGj5+xP5SjRwIhHguNC5HBP3zmy0r7pNFlU 3b+4iKsRXc6tWGI1GlhaNqhj4bsB+SknjyNI8HAKl5+IHsOo9d/cFo9CepP9BFasQurkc6Qr9ihZJ9 tJ7w4skbC1NxxG2JpoG2iUaE7esX1k X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Since all the fields in the main ID registers are 4 bits wide we have up until now not bothered specifying the width in the code. Since we now wish to use this mechanism to enumerate features from the floating point feature registers which do not follow this pattern add a width to the table. This means updating all the existing table entries but makes it less likely that we run into issues in future due to implicitly assuming a 4 bit width. Signed-off-by: Mark Brown Cc: Suzuki K Poulose --- arch/arm64/include/asm/cpufeature.h | 1 + arch/arm64/kernel/cpufeature.c | 167 +++++++++++++++++----------- 2 files changed, 102 insertions(+), 66 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index ef6be92b1921..2728abd9cae4 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -356,6 +356,7 @@ struct arm64_cpu_capabilities { struct { /* Feature register checking */ u32 sys_reg; u8 field_pos; + u8 field_width; u8 min_field_value; u8 hwcap_type; bool sign; diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index a46ab3b1c4d5..d9f09e40aaf6 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1307,7 +1307,9 @@ u64 __read_sysreg_by_encoding(u32 sys_id) static bool feature_matches(u64 reg, const struct arm64_cpu_capabilities *entry) { - int val = cpuid_feature_extract_field(reg, entry->field_pos, entry->sign); + int val = cpuid_feature_extract_field_width(reg, entry->field_pos, + entry->field_width, + entry->sign); return val >= entry->min_field_value; } @@ -1952,6 +1954,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64MMFR0_EL1, .field_pos = ID_AA64MMFR0_ECV_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 1, }, @@ -1963,6 +1966,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64MMFR1_EL1, .field_pos = ID_AA64MMFR1_PAN_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 1, .cpu_enable = cpu_enable_pan, @@ -1976,6 +1980,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64MMFR1_EL1, .field_pos = ID_AA64MMFR1_PAN_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 3, }, @@ -1988,6 +1993,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_ATOMICS_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 2, }, @@ -2012,6 +2018,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64PFR0_EL0_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR0_ELx_32BIT_64BIT, }, #ifdef CONFIG_KVM @@ -2023,6 +2030,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64PFR0_EL1_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR0_ELx_32BIT_64BIT, }, { @@ -2043,6 +2051,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { */ .sys_reg = SYS_ID_AA64PFR0_EL1, .field_pos = ID_AA64PFR0_CSV3_SHIFT, + .field_width = 4, .min_field_value = 1, .matches = unmap_kernel_at_el0, .cpu_enable = kpti_install_ng_mappings, @@ -2062,6 +2071,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR1_EL1, .field_pos = ID_AA64ISAR1_DPB_SHIFT, + .field_width = 4, .min_field_value = 1, }, { @@ -2072,6 +2082,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_DPB_SHIFT, + .field_width = 4, .min_field_value = 2, }, #endif @@ -2083,6 +2094,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64PFR0_SVE_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR0_SVE, .matches = has_cpuid_feature, .cpu_enable = sve_kernel_enable, @@ -2097,6 +2109,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64PFR0_RAS_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR0_RAS_V1, .cpu_enable = cpu_clear_disr, }, @@ -2115,6 +2128,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64PFR0_AMU_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR0_AMU, .cpu_enable = cpu_amu_enable, }, @@ -2139,6 +2153,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64MMFR2_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64MMFR2_FWB_SHIFT, + .field_width = 4, .min_field_value = 1, .matches = has_cpuid_feature, .cpu_enable = cpu_has_fwb, @@ -2150,6 +2165,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64MMFR2_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64MMFR2_TTL_SHIFT, + .field_width = 4, .min_field_value = 1, .matches = has_cpuid_feature, }, @@ -2160,6 +2176,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_TLB_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = ID_AA64ISAR0_TLB_RANGE, }, @@ -2178,6 +2195,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64MMFR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64MMFR1_HADBS_SHIFT, + .field_width = 4, .min_field_value = 2, .matches = has_hw_dbm, .cpu_enable = cpu_enable_hw_dbm, @@ -2190,6 +2208,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_CRC32_SHIFT, + .field_width = 4, .min_field_value = 1, }, { @@ -2199,6 +2218,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64PFR1_EL1, .field_pos = ID_AA64PFR1_SSBS_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = ID_AA64PFR1_SSBS_PSTATE_ONLY, }, @@ -2211,6 +2231,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64MMFR2_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64MMFR2_CNP_SHIFT, + .field_width = 4, .min_field_value = 1, .cpu_enable = cpu_enable_cnp, }, @@ -2222,6 +2243,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR1_EL1, .field_pos = ID_AA64ISAR1_SB_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 1, }, @@ -2233,6 +2255,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_APA_SHIFT, + .field_width = 4, .min_field_value = ID_AA64ISAR1_APA_ARCHITECTED, .matches = has_address_auth_cpucap, }, @@ -2243,6 +2266,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_API_SHIFT, + .field_width = 4, .min_field_value = ID_AA64ISAR1_API_IMP_DEF, .matches = has_address_auth_cpucap, }, @@ -2258,6 +2282,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_GPA_SHIFT, + .field_width = 4, .min_field_value = ID_AA64ISAR1_GPA_ARCHITECTED, .matches = has_cpuid_feature, }, @@ -2268,6 +2293,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_GPI_SHIFT, + .field_width = 4, .min_field_value = ID_AA64ISAR1_GPI_IMP_DEF, .matches = has_cpuid_feature, }, @@ -2288,6 +2314,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = can_use_gic_priorities, .sys_reg = SYS_ID_AA64PFR0_EL1, .field_pos = ID_AA64PFR0_GIC_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 1, }, @@ -2299,6 +2326,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .type = ARM64_CPUCAP_SYSTEM_FEATURE, .sys_reg = SYS_ID_AA64MMFR2_EL1, .sign = FTR_UNSIGNED, + .field_width = 4, .field_pos = ID_AA64MMFR2_E0PD_SHIFT, .matches = has_cpuid_feature, .min_field_value = 1, @@ -2313,6 +2341,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_RNDR_SHIFT, + .field_width = 4, .sign = FTR_UNSIGNED, .min_field_value = 1, }, @@ -2330,6 +2359,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .cpu_enable = bti_enable, .sys_reg = SYS_ID_AA64PFR1_EL1, .field_pos = ID_AA64PFR1_BT_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR1_BT_BTI, .sign = FTR_UNSIGNED, }, @@ -2342,6 +2372,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64PFR1_EL1, .field_pos = ID_AA64PFR1_MTE_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR1_MTE, .sign = FTR_UNSIGNED, .cpu_enable = cpu_enable_mte, @@ -2353,6 +2384,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64PFR1_EL1, .field_pos = ID_AA64PFR1_MTE_SHIFT, + .field_width = 4, .min_field_value = ID_AA64PFR1_MTE_ASYMM, .sign = FTR_UNSIGNED, }, @@ -2364,16 +2396,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_LRCPC_SHIFT, + .field_width = 4, .matches = has_cpuid_feature, .min_field_value = 1, }, {}, }; -#define HWCAP_CPUID_MATCH(reg, field, s, min_value) \ +#define HWCAP_CPUID_MATCH(reg, field, width, s, min_value) \ .matches = has_cpuid_feature, \ .sys_reg = reg, \ .field_pos = field, \ + .field_width = width, \ .sign = s, \ .min_field_value = min_value, @@ -2383,10 +2417,10 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .hwcap_type = cap_type, \ .hwcap = cap, \ -#define HWCAP_CAP(reg, field, s, min_value, cap_type, cap) \ +#define HWCAP_CAP(reg, field, width, s, min_value, cap_type, cap) \ { \ __HWCAP_CAP(#cap, cap_type, cap) \ - HWCAP_CPUID_MATCH(reg, field, s, min_value) \ + HWCAP_CPUID_MATCH(reg, field, width, s, min_value) \ } #define HWCAP_MULTI_CAP(list, cap_type, cap) \ @@ -2406,11 +2440,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = { static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = { { HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_APA_SHIFT, - FTR_UNSIGNED, ID_AA64ISAR1_APA_ARCHITECTED) + 4, FTR_UNSIGNED, + ID_AA64ISAR1_APA_ARCHITECTED) }, { HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_API_SHIFT, - FTR_UNSIGNED, ID_AA64ISAR1_API_IMP_DEF) + 4, FTR_UNSIGNED, ID_AA64ISAR1_API_IMP_DEF) }, {}, }; @@ -2418,77 +2453,77 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = { static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = { { HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_GPA_SHIFT, - FTR_UNSIGNED, ID_AA64ISAR1_GPA_ARCHITECTED) + 4, FTR_UNSIGNED, ID_AA64ISAR1_GPA_ARCHITECTED) }, { HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_GPI_SHIFT, - FTR_UNSIGNED, ID_AA64ISAR1_GPI_IMP_DEF) + 4, FTR_UNSIGNED, ID_AA64ISAR1_GPI_IMP_DEF) }, {}, }; #endif static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_PMULL), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AES), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA1), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA2), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_SHA512), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_CRC32_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_CRC32), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_ATOMICS_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ATOMICS), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_RDM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDRDM), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA3), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SM3), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM4_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SM4), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_DP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDDP), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_FHM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDFHM), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_TS_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FLAGM), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_TS_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_FLAGM2), - HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_RNDR_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RNG), - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, KERNEL_HWCAP_FP), - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FPHP), - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, KERNEL_HWCAP_ASIMD), - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP), - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_DIT_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DIT), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ILRCPC), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FRINTTS_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FRINT), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_SB_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SB), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_BF16_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_BF16), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DGH_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DGH), - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_I8MM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_I8MM), - HWCAP_CAP(SYS_ID_AA64MMFR2_EL1, ID_AA64MMFR2_AT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_USCAT), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_PMULL), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AES), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA1_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA1), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA2), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_SHA512), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_CRC32_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_CRC32), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_ATOMICS_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ATOMICS), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_RDM_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDRDM), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA3_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SHA3), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM3_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SM3), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM4_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SM4), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_DP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDDP), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_FHM_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDFHM), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_TS_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FLAGM), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_TS_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_FLAGM2), + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_RNDR_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RNG), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, 4, FTR_SIGNED, 0, CAP_HWCAP, KERNEL_HWCAP_FP), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, 4, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FPHP), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, 4, FTR_SIGNED, 0, CAP_HWCAP, KERNEL_HWCAP_ASIMD), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, 4, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_DIT_SHIFT, 4, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DIT), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ILRCPC), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FRINTTS_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FRINT), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_SB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SB), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_BF16_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_BF16), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DGH_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DGH), + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_I8MM_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_I8MM), + HWCAP_CAP(SYS_ID_AA64MMFR2_EL1, ID_AA64MMFR2_AT_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_USCAT), #ifdef CONFIG_ARM64_SVE - HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_SVE_SHIFT, FTR_UNSIGNED, ID_AA64PFR0_SVE, CAP_HWCAP, KERNEL_HWCAP_SVE), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SVEVER_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_SVEVER_SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_AES, CAP_HWCAP, KERNEL_HWCAP_SVEAES), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_AES_PMULL, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BITPERM_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_BITPERM, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BF16_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_BF16, CAP_HWCAP, KERNEL_HWCAP_SVEBF16), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SHA3_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_SHA3, CAP_HWCAP, KERNEL_HWCAP_SVESHA3), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SM4_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_SM4, CAP_HWCAP, KERNEL_HWCAP_SVESM4), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_I8MM_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_I8MM, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F32MM_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_F32MM, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM), - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F64MM_SHIFT, FTR_UNSIGNED, ID_AA64ZFR0_F64MM, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM), + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_SVE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR0_SVE, CAP_HWCAP, KERNEL_HWCAP_SVE), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SVEVER_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SVEVER_SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_AES, CAP_HWCAP, KERNEL_HWCAP_SVEAES), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_AES_PMULL, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BITPERM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_BITPERM, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BF16_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_BF16, CAP_HWCAP, KERNEL_HWCAP_SVEBF16), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SHA3_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SHA3, CAP_HWCAP, KERNEL_HWCAP_SVESHA3), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SM4_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SM4, CAP_HWCAP, KERNEL_HWCAP_SVESM4), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_I8MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_I8MM, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F32MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_F32MM, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM), + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F64MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_F64MM, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM), #endif - HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SSBS_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_SSBS_PSTATE_INSNS, CAP_HWCAP, KERNEL_HWCAP_SSBS), + HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SSBS_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SSBS_PSTATE_INSNS, CAP_HWCAP, KERNEL_HWCAP_SSBS), #ifdef CONFIG_ARM64_BTI - HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_BT_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_BT_BTI, CAP_HWCAP, KERNEL_HWCAP_BTI), + HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_BT_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_BT_BTI, CAP_HWCAP, KERNEL_HWCAP_BTI), #endif #ifdef CONFIG_ARM64_PTR_AUTH HWCAP_MULTI_CAP(ptr_auth_hwcap_addr_matches, CAP_HWCAP, KERNEL_HWCAP_PACA), HWCAP_MULTI_CAP(ptr_auth_hwcap_gen_matches, CAP_HWCAP, KERNEL_HWCAP_PACG), #endif #ifdef CONFIG_ARM64_MTE - HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE), + HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE), #endif /* CONFIG_ARM64_MTE */ - HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV), - HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP), - HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), + HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV), + HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP), + HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), {}, }; @@ -2517,15 +2552,15 @@ static bool compat_has_neon(const struct arm64_cpu_capabilities *cap, int scope) static const struct arm64_cpu_capabilities compat_elf_hwcaps[] = { #ifdef CONFIG_COMPAT HWCAP_CAP_MATCH(compat_has_neon, CAP_COMPAT_HWCAP, COMPAT_HWCAP_NEON), - HWCAP_CAP(SYS_MVFR1_EL1, MVFR1_SIMDFMAC_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv4), + HWCAP_CAP(SYS_MVFR1_EL1, MVFR1_SIMDFMAC_SHIFT, 4, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv4), /* Arm v8 mandates MVFR0.FPDP == {0, 2}. So, piggy back on this for the presence of VFP support */ - HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFP), - HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv3), - HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL), - HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES), - HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1), - HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA2_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA2), - HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_CRC32_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_CRC32), + HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, 4, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFP), + HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, 4, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv3), + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, 4, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL), + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES), + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, 4, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1), + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA2_SHIFT, 4, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA2), + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_CRC32_SHIFT, 4, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_CRC32), #endif {}, }; From patchwork Wed Jan 26 15:10:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537027 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EE0CC28CF5 for ; Wed, 26 Jan 2022 15:11:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242592AbiAZPLw (ORCPT ); Wed, 26 Jan 2022 10:11:52 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:40244 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPLv (ORCPT ); Wed, 26 Jan 2022 10:11:51 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EDCDA61805 for ; Wed, 26 Jan 2022 15:11:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB488C340EC; Wed, 26 Jan 2022 15:11:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209910; bh=cINPFgKL2Ut8k3Oy2wD5fkInRC1CnwPrnusLBuDdThE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tT9VCyBdrS3B9ZoFShxsqCz0QxtS2GdjGFhNhWm+Wxm0+/lN55eFms5GjA9WpiJ5B k3FJTgTxkX3Hc7akhF0ks3701jAyyElK4OgnIsD7BgtZdN9hWldYMyyxGZfhQoMVWB NwAcwiEduh9nvseyckZkgVQquWKfauiBDzPBg6C6YSY1i4eHgklH9HunM+v2I6fchF LWWo4E6yKE3fdEW25zXzQoJn+0eJkdlMWwMyOAYgJV3t8gD9ltQTOmqsRUh9ojyPp3 cnFnq9j4nWBKNq5fGJZnXsxhTDcY73fYVkubMsEu9xKdQoEIfnaKfXBYLXBCgHbIkI OcdXOa4nSPlCg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 04/40] kselftest/arm64: Remove local ARRAY_SIZE() definitions Date: Wed, 26 Jan 2022 15:10:44 +0000 Message-Id: <20220126151120.3811248-5-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1265; h=from:subject; bh=cINPFgKL2Ut8k3Oy2wD5fkInRC1CnwPrnusLBuDdThE=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR5gEzITa3c+LpcX6BANIKS7MW2GTUq48+G78Hk TYKq/JWJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkeQAKCRAk1otyXVSH0DvxB/ 9DJRa/JLX29ujvL1yvZ0g9P3+KR265G4LetYNuU7RKHVV5gpVLg+fixOk08tTNFfOXsr+WLckAX6TN Prbs25szdV2Cbxv1CuWeZsvvBps90ydYJDEYqIatKaFKofhE99c2dzychkpYJUe+TyhHXuC6/wMZbI nTjsR8IVrtsNQj++yMjRbLe7jrbTnlqvCpsifzHegxb/rkUJHmFjHBuikAuxV5pEBkCSzfFCEe+qwk okOVELgMp5z9ah7y9hcXi5cRZn6t+EuUwRYh/khA/FNohoBEMBiAchgH55OZymASTGGnB9yaI0pFhR ulIiHCJcBqbTaQR2sZA1yk85U+zQv/ X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org An ARRAY_SIZE() has been added to kselftest.h so remove the local versions in some of the arm64 selftests. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/abi/syscall-abi.c | 1 - tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 -- 2 files changed, 3 deletions(-) diff --git a/tools/testing/selftests/arm64/abi/syscall-abi.c b/tools/testing/selftests/arm64/abi/syscall-abi.c index d8eeeafb50dc..1e13b7523918 100644 --- a/tools/testing/selftests/arm64/abi/syscall-abi.c +++ b/tools/testing/selftests/arm64/abi/syscall-abi.c @@ -18,7 +18,6 @@ #include "../../kselftest.h" -#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) #define NUM_VL ((SVE_VQ_MAX - SVE_VQ_MIN) + 1) extern void do_syscall(int sve_vl); diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c index af798b9d232c..90ba1d6a6781 100644 --- a/tools/testing/selftests/arm64/fp/sve-ptrace.c +++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c @@ -21,8 +21,6 @@ #include "../../kselftest.h" -#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) - /* and don't like each other, so: */ #ifndef NT_ARM_SVE #define NT_ARM_SVE 0x405 From patchwork Wed Jan 26 15:10:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1224DC2BA4C for ; Wed, 26 Jan 2022 15:11:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242587AbiAZPL5 (ORCPT ); Wed, 26 Jan 2022 10:11:57 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:40930 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPL4 (ORCPT ); Wed, 26 Jan 2022 10:11:56 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 86578B81E3A for ; Wed, 26 Jan 2022 15:11:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CE230C340E8; Wed, 26 Jan 2022 15:11:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209914; bh=LILDRv5nj5WSygqoRB1U9j0mnm4jU4VyQjP7vWy8K4A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hqM9qn1w3L+FNzGjZC1lfW0VJKD5v6Q+YbZN9biC5WPtrUTIrdS11pt7aZb48++5j L8Xc11of5VQelLGT8mXEdvUsGnLIk2pAYT8vvb6j3L0BQPc7opKYXjxafyqyT1UxUp fRL61nPoBHPNyBbUA2FWsOef3aWJUXQYHKnAmZAjWOtR+3Iiz7ydYe3RyKf3qy59hk FJWPd0SeFuu6NtSiWvx42650VizVNhOtFb1oUM56C5qH/L92SSkZgv8a9dTxDZqzpg g2IqNgNJUo3IPEFkZuVqqamzesDafeFQNSRz8BGj3FcCh0bz/Kq7Xvn8zkOdFlBEEF jsjtFNCxschEQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 05/40] arm64/sme: Provide ABI documentation for SME Date: Wed, 26 Jan 2022 15:10:45 +0000 Message-Id: <20220126151120.3811248-6-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=25779; h=from:subject; bh=LILDRv5nj5WSygqoRB1U9j0mnm4jU4VyQjP7vWy8K4A=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR6OEtcuSEtc9fSkU3TG4WsW13ZDXHLhwZfXfqA 3e+H2TyJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkegAKCRAk1otyXVSH0Eo4B/ 9IFit9dxXzFssQmgTr5/1oxUVmQBZa+ZkqgqI1le5EHvqvzyku/ph6ZUvi3hHq5jN45RajeaKsr9Tm Af5OeKBZ3JqSQHJ2jwz9WQaMFoR2ikB+wLfC2BqYvAsaTWjsg2ewdlGJyhqolberFZgVApH2/6lNf9 nionycyBnZoSsOvZozyUZPyX7M2mYVL1JkX8H4yfGeem+qrt0awVo9nCMktVMEXMjtXelB1cxq3aJE AIIq2BNzX6H/RHZVpTK+sYT0/cPXhoNW1T9X0yZVPhDKhJZ/Au0noi4t/WshJ7gRNNgciOsCfLvM1P kOzAQ56f5NTEZSBstBmziPBn1bKI2C X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Provide ABI documentation for SME similar to that for SVE. Due to the very large overlap around streaming SVE mode in both implementation and interfaces documentation for streaming mode SVE is added to the SVE document rather than the SME one. Signed-off-by: Mark Brown --- Documentation/arm64/index.rst | 1 + Documentation/arm64/sme.rst | 432 ++++++++++++++++++++++++++++++++++ Documentation/arm64/sve.rst | 70 +++++- 3 files changed, 493 insertions(+), 10 deletions(-) create mode 100644 Documentation/arm64/sme.rst diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst index 4f840bac083e..ae21f8118830 100644 --- a/Documentation/arm64/index.rst +++ b/Documentation/arm64/index.rst @@ -21,6 +21,7 @@ ARM64 Architecture perf pointer-authentication silicon-errata + sme sve tagged-address-abi tagged-pointers diff --git a/Documentation/arm64/sme.rst b/Documentation/arm64/sme.rst new file mode 100644 index 000000000000..15df3157c592 --- /dev/null +++ b/Documentation/arm64/sme.rst @@ -0,0 +1,432 @@ +=================================================== +Scalable Matrix Extension support for AArch64 Linux +=================================================== + +This document outlines briefly the interface provided to userspace by Linux in +order to support use of the ARM Scalable Matrix Extension (SME). + +This is an outline of the most important features and issues only and not +intended to be exhaustive. It should be read in conjunction with the SVE +documentation in sve.rst which provides details on the Streaming SVE mode +included in SME. + +This document does not aim to describe the SME architecture or programmer's +model. To aid understanding, a minimal description of relevant programmer's +model features for SME is included in Appendix A. + + +1. General +----------- + +* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA + register state and TPIDR2_EL0 are tracked per thread. + +* The presence of SME is reported to userspace via HWCAP2_SME in the aux vector + AT_HWCAP2 entry. Presence of this flag implies the presence of the SME + instructions and registers, and the Linux-specific system interfaces + described in this document. SME is reported in /proc/cpuinfo as "sme". + +* Support for the execution of SME instructions in userspace can also be + detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS + instruction, and checking that the value of the SME field is nonzero. [3] + + It does not guarantee the presence of the system interfaces described in the + following sections: software that needs to verify that those interfaces are + present must check for HWCAP2_SME instead. + +* There are a number of optional SME features, presence of these is reported + through AT_HWCAP2 through: + + HWCAP2_SME_I16I64 + HWCAP2_SME_F64F64 + HWCAP2_SME_I8I32 + HWCAP2_SME_F16F32 + HWCAP2_SME_B16F32 + HWCAP2_SME_F32F32 + HWCAP2_SME_FA64 + + This list may be extended over time as the SME architecture evolves. + + These extensions are also reported via the CPU ID register ID_AA64SMFR0_EL1, + which userspace can read using an MRS instruction. See elf_hwcaps.txt and + cpu-feature-registers.txt for details. + +* Debuggers should restrict themselves to interacting with the target via the + NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way + of detecting support for these regsets is to connect to a target process + first and then attempt a + + ptrace(PTRACE_GETREGSET, pid, NT_ARM_, &iov). + +* Whenever ZA register values are exchanged in memory between userspace and + the kernel, the register value is encoded in memory as a series of horizontal + vectors from 0 to VL/8-1 stored in the same endianness invariant format as is + used for SVE vectors. + +* On thread creation TPIDR2_EL0 is preserved unless CLONE_SETTLS is specified, + in which case it is set to 0. + +2. Vector lengths +------------------ + +SME defines a second vector length similar to the SVE vector length which is +controls the size of the streaming mode SVE vectors and the ZA matrix array. +The ZA matrix is square with each side having as many bytes as a SVE vector. + + +3. Sharing of streaming and non-streaming mode SVE state +--------------------------------------------------------- + +It is implementation defined which if any parts of the SVE state are shared +between streaming and non-streaming modes. When switching between modes +via software interfaces such as ptrace if no register content is provided as +part of switching no state will be assumed to be shared and everything will +be zeroed. + + +4. System call behaviour +------------------------- + +* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the + ZA matrix are preserved. + +* On syscall PSTATE.SM will be cleared and the SVE registers will be handled + as normal. + +* Neither the SVE registers nor ZA are used to pass arguments to or receive + results from any syscall. + +* On creation fork() or clone() the newly created process will have PSTATE.SM + and PSTATE.ZA cleared. + +* All other SME state of a thread, including the currently configured vector + length, the state of the PR_SME_VL_INHERIT flag, and the deferred vector + length (if any), is preserved across all syscalls, subject to the specific + exceptions for execve() described in section 6. + + +5. Signal handling +------------------- + +* Signal handlers are invoked with streaming mode and ZA disabled. + +* A new signal frame record za_context encodes the ZA register contents on + signal delivery. [1] + +* The signal frame record for ZA always contains basic metadata, in particular + the thread's vector length (in za_context.vl). + +* The ZA matrix may or may not be included in the record, depending on + the value of PSTATE.ZA. The registers are present if and only if: + za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl)) + in which case PSTATE.ZA == 1. + +* If matrix data is present, the remainder of the record has a vl-dependent + size and layout. Macros ZA_SIG_* are defined [1] to facilitate access to + them. + +* The matrix is stored as a series of horizontal vectors in the same format as + is used for SVE vectors. + +* If the ZA context is too big to fit in sigcontext.__reserved[], then extra + space is allocated on the stack, an extra_context record is written in + __reserved[] referencing this space. za_context is then written in the + extra space. Refer to [1] for further details about this mechanism. + + +5. Signal return +----------------- + +When returning from a signal handler: + +* If there is no za_context record in the signal frame, or if the record is + present but contains no register data as desribed in the previous section, + then ZA is disabled. + +* If za_context is present in the signal frame and contains matrix data then + PSTATE.ZA is set to 1 and ZA is populated with the specified data. + +* The vector length cannot be changed via signal return. If za_context.vl in + the signal frame does not match the current vector length, the signal return + attempt is treated as illegal, resulting in a forced SIGSEGV. + + +6. prctl extensions +-------------------- + +Some new prctl() calls are added to allow programs to manage the SME vector +length: + +prctl(PR_SME_SET_VL, unsigned long arg) + + Sets the vector length of the calling thread and related flags, where + arg == vl | flags. Other threads of the calling process are unaffected. + + vl is the desired vector length, where sve_vl_valid(vl) must be true. + + flags: + + PR_SME_VL_INHERIT + + Inherit the current vector length across execve(). Otherwise, the + vector length is reset to the system default at execve(). (See + Section 9.) + + PR_SME_SET_VL_ONEXEC + + Defer the requested vector length change until the next execve() + performed by this thread. + + The effect is equivalent to implicit exceution of the following + call immediately after the next execve() (if any) by the thread: + + prctl(PR_SME_SET_VL, arg & ~PR_SME_SET_VL_ONEXEC) + + This allows launching of a new program with a different vector + length, while avoiding runtime side effects in the caller. + + Without PR_SME_SET_VL_ONEXEC, the requested change takes effect + immediately. + + + Return value: a nonnegative on success, or a negative value on error: + EINVAL: SME not supported, invalid vector length requested, or + invalid flags. + + + On success: + + * Either the calling thread's vector length or the deferred vector length + to be applied at the next execve() by the thread (dependent on whether + PR_SME_SET_VL_ONEXEC is present in arg), is set to the largest value + supported by the system that is less than or equal to vl. If vl == + SVE_VL_MAX, the value set will be the largest value supported by the + system. + + * Any previously outstanding deferred vector length change in the calling + thread is cancelled. + + * The returned value describes the resulting configuration, encoded as for + PR_SME_GET_VL. The vector length reported in this value is the new + current vector length for this thread if PR_SME_SET_VL_ONEXEC was not + present in arg; otherwise, the reported vector length is the deferred + vector length that will be applied at the next execve() by the calling + thread. + + * Changing the vector length causes all of ZA, P0..P15, FFR and all bits of + Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become + unspecified, including both streaming and non-streaming SVE state. + Calling PR_SME_SET_VL with vl equal to the thread's current vector + length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag, + does not constitute a change to the vector length for this purpose. + + * Changing the vector length causes PSTATE.ZA and PSTATE.SM to be cleared. + Calling PR_SME_SET_VL with vl equal to the thread's current vector + length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag, + does not constitute a change to the vector length for this purpose. + + +prctl(PR_SME_GET_VL) + + Gets the vector length of the calling thread. + + The following flag may be OR-ed into the result: + + PR_SME_VL_INHERIT + + Vector length will be inherited across execve(). + + There is no way to determine whether there is an outstanding deferred + vector length change (which would only normally be the case between a + fork() or vfork() and the corresponding execve() in typical use). + + To extract the vector length from the result, bitwise and it with + PR_SME_VL_LEN_MASK. + + Return value: a nonnegative value on success, or a negative value on error: + EINVAL: SME not supported. + + +7. ptrace extensions +--------------------- + +* A new regset NT_ARM_SSVE is defined for access to streaming mode SVE + state via PTRACE_GETREGSET and PTRACE_SETREGSET, this is documented in + sve.rst. + +* A new regset NT_ARM_ZA is defined for ZA state for access to ZA state via + PTRACE_GETREGSET and PTRACE_SETREGSET. + + Refer to [2] for definitions. + +The regset data starts with struct user_za_header, containing: + + size + + Size of the complete regset, in bytes. + This depends on vl and possibly on other things in the future. + + If a call to PTRACE_GETREGSET requests less data than the value of + size, the caller can allocate a larger buffer and retry in order to + read the complete regset. + + max_size + + Maximum size in bytes that the regset can grow to for the target + thread. The regset won't grow bigger than this even if the target + thread changes its vector length etc. + + vl + + Target thread's current streaming vector length, in bytes. + + max_vl + + Maximum possible streaming vector length for the target thread. + + flags + + Zero or more of the following flags, which have the same + meaning and behaviour as the corresponding PR_SET_VL_* flags: + + SME_PT_VL_INHERIT + + SME_PT_VL_ONEXEC (SETREGSET only). + +* The effects of changing the vector length and/or flags are equivalent to + those documented for PR_SME_SET_VL. + + The caller must make a further GETREGSET call if it needs to know what VL is + actually set by SETREGSET, unless is it known in advance that the requested + VL is supported. + +* The size and layout of the payload depends on the header fields. The + SME_PT_ZA_*() macros are provided to facilitate access to the data. + +* In either case, for SETREGSET it is permissible to omit the payload, in which + case the vector length and flags are changed and PSTATE.ZA is set to 0 + (along with any consequences of those changes). If a payload is provided + then PSTATE.ZA will be set to 1. + +* For SETREGSET, if the requested VL is not supported, the effect will be the + same as if the payload were omitted, except that an EIO error is reported. + No attempt is made to translate the payload data to the correct layout + for the vector length actually set. It is up to the caller to translate the + payload layout for the actual VL and retry. + +* The effect of writing a partial, incomplete payload is unspecified. + + +8. ELF coredump extensions +--------------------------- + +* NT_ARM_SSVE notes will be added to each coredump for + each thread of the dumped process. The contents will be equivalent to the + data that would have been read if a PTRACE_GETREGSET of the corresponding + type were executed for each thread when the coredump was generated. + +* A NT_ARM_ZA note will be added to each coredump for each thread of the + dumped process. The contents will be equivalent to the data that would have + been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread + when the coredump was generated. + + +9. System runtime configuration +-------------------------------- + +* To mitigate the ABI impact of expansion of the signal frame, a policy + mechanism is provided for administrators, distro maintainers and developers + to set the default vector length for userspace processes: + +/proc/sys/abi/sme_default_vector_length + + Writing the text representation of an integer to this file sets the system + default vector length to the specified value, unless the value is greater + than the maximum vector length supported by the system in which case the + default vector length is set to that maximum. + + The result can be determined by reopening the file and reading its + contents. + + At boot, the default vector length is initially set to 32 or the maximum + supported vector length, whichever is smaller and supported. This + determines the initial vector length of the init process (PID 1). + + Reading this file returns the current system default vector length. + +* At every execve() call, the new vector length of the new process is set to + the system default vector length, unless + + * PR_SME_VL_INHERIT (or equivalently SME_PT_VL_INHERIT) is set for the + calling thread, or + + * a deferred vector length change is pending, established via the + PR_SME_SET_VL_ONEXEC flag (or SME_PT_VL_ONEXEC). + +* Modifying the system default vector length does not affect the vector length + of any existing process or thread that does not make an execve() call. + + +Appendix A. SME programmer's model (informative) +================================================= + +This section provides a minimal description of the additions made by SVE to the +ARMv8-A programmer's model that are relevant to this document. + +Note: This section is for information only and not intended to be complete or +to replace any architectural specification. + +A.1. Registers +--------------- + +In A64 state, SME adds the following: + +* A new mode, streaming mode, in which a subset of the normal FPSIMD and SVE + features are available. When supported EL0 software may enter and leave + streaming mode at any time. + + For best system performance it is strongly encouraged for software to enable + streaming mode only when it is actively being used. + +* A new vector length controlling the size of ZA and the Z registers when in + streaming mode, separately to the vector length used for SVE when not in + streaming mode. There is no requirement that either the currently selected + vector length or the set of vector lengths supported for the two modes in + a given system have any relationship. The streaming mode vector length + is referred to as SVL. + +* A new ZA matrix register. This is a square matrix of SVLxSVL bits. Most + operations on ZA require that streaming mode be enabled but ZA can be + enabled without streaming mode in order to load, save and retain data. + + For best system performance it is strongly encouraged for software to enable + ZA only when it is actively being used. + +* Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and + SMSTOP instructions or by access to the SVCR system register: + + * PSTATE.ZA, if this is 1 then the ZA matrix is accessible and has valid + data while if it is 0 then ZA can not be accessed. When PSTATE.ZA is + changed from 0 to 1 all bits in ZA are cleared. + + * PSTATE.SM, if this is 1 then the PE is in streaming mode. When the value + of PSTATE.SM is changed then it is implementation defined if the subset + of the floating point register bits valid in both modes may be retained. + Any other bits will be cleared. + + +References +========== + +[1] arch/arm64/include/uapi/asm/sigcontext.h + AArch64 Linux signal ABI definitions + +[2] arch/arm64/include/uapi/asm/ptrace.h + AArch64 Linux ptrace ABI definitions + +[3] Documentation/arm64/cpu-feature-registers.rst + +[4] ARM IHI0055C + http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf + http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html + Procedure Call Standard for the ARM 64-bit Architecture (AArch64) diff --git a/Documentation/arm64/sve.rst b/Documentation/arm64/sve.rst index 9d9a4de5bc34..93c2c2990584 100644 --- a/Documentation/arm64/sve.rst +++ b/Documentation/arm64/sve.rst @@ -7,7 +7,9 @@ Author: Dave Martin Date: 4 August 2017 This document outlines briefly the interface provided to userspace by Linux in -order to support use of the ARM Scalable Vector Extension (SVE). +order to support use of the ARM Scalable Vector Extension (SVE), including +interactions with Streaming SVE mode added by the Scalable Matrix Extension +(SME). This is an outline of the most important features and issues only and not intended to be exhaustive. @@ -23,6 +25,10 @@ model features for SVE is included in Appendix A. * SVE registers Z0..Z31, P0..P15 and FFR and the current vector length VL, are tracked per-thread. +* In streaming mode FFR is not accessible unless HWCAP2_SME_FA64 is present + in the system, when it is not supported and these interfaces are used to + access streaming mode FFR is read and written as zero. + * The presence of SVE is reported to userspace via HWCAP_SVE in the aux vector AT_HWCAP entry. Presence of this flag implies the presence of the SVE instructions and registers, and the Linux-specific system interfaces @@ -53,10 +59,19 @@ model features for SVE is included in Appendix A. which userspace can read using an MRS instruction. See elf_hwcaps.txt and cpu-feature-registers.txt for details. +* On hardware that supports the SME extensions, HWCAP2_SME will also be + reported in the AT_HWCAP2 aux vector entry. Among other things SME adds + streaming mode which provides a subset of the SVE feature set using a + separate SME vector length and the same Z/V registers. See sme.rst + for more details. + * Debuggers should restrict themselves to interacting with the target via the NT_ARM_SVE regset. The recommended way of detecting support for this regset is to connect to a target process first and then attempt a - ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov). + ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov). Note that when SME is + present and streaming SVE mode is in use the FPSIMD subset of registers + will be read via NT_ARM_SVE and NT_ARM_SVE writes will exit streaming mode + in the target. * Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory between userspace and the kernel, the register value is encoded in memory in @@ -126,6 +141,11 @@ the SVE instruction set architecture. are only present in fpsimd_context. For convenience, the content of V0..V31 is duplicated between sve_context and fpsimd_context. +* The record contains a flag field which includes a flag SVE_SIG_FLAG_SM which + if set indicates that the thread is in streaming mode and the vector length + and register data (if present) describe the streaming SVE data and vector + length. + * The signal frame record for SVE always contains basic metadata, in particular the thread's vector length (in sve_context.vl). @@ -170,6 +190,11 @@ When returning from a signal handler: the signal frame does not match the current vector length, the signal return attempt is treated as illegal, resulting in a forced SIGSEGV. +* It is permitted to enter or leave streaming mode by setting or clearing + the SVE_SIG_FLAG_SM flag but applications should take care to ensure that + when doing so sve_context.vl and any register data are appropriate for the + vector length in the new mode. + 6. prctl extensions -------------------- @@ -265,8 +290,14 @@ prctl(PR_SVE_GET_VL) 7. ptrace extensions --------------------- -* A new regset NT_ARM_SVE is defined for use with PTRACE_GETREGSET and - PTRACE_SETREGSET. +* New regsets NT_ARM_SVE and NT_ARM_SSVE are defined for use with + PTRACE_GETREGSET and PTRACE_SETREGSET. NT_ARM_SSVE describes the + streaming mode SVE registers and NT_ARM_SVE describes the + non-streaming mode SVE registers. + + In this description a register set is referred to as being "live" when + the target is in the appropriate streaming or non-streaming mode and is + using data beyond the subset shared with the FPSIMD Vn registers. Refer to [2] for definitions. @@ -297,7 +328,7 @@ The regset data starts with struct user_sve_header, containing: flags - either + at most one of SVE_PT_REGS_FPSIMD @@ -331,6 +362,10 @@ The regset data starts with struct user_sve_header, containing: SVE_PT_VL_ONEXEC (SETREGSET only). + If neither FPSIMD nor SVE flags are provided then no register + payload is available, this is only possible when SME is implemented. + + * The effects of changing the vector length and/or flags are equivalent to those documented for PR_SVE_SET_VL. @@ -346,6 +381,13 @@ The regset data starts with struct user_sve_header, containing: case only the vector length and flags are changed (along with any consequences of those changes). +* In systems supporting SME when in streaming mode a GETREGSET for + NT_REG_SVE will return only the user_sve_header with no register data, + similarly a GETREGSET for NT_REG_SSVE will not return any register data + when not in streaming mode. + +* A GETREGSET for NT_ARM_SSVE will never return SVE_PT_REGS_FPSIMD. + * For SETREGSET, if an SVE_PT_REGS_SVE payload is present and the requested VL is not supported, the effect will be the same as if the payload were omitted, except that an EIO error is reported. No @@ -355,17 +397,25 @@ The regset data starts with struct user_sve_header, containing: unspecified. It is up to the caller to translate the payload layout for the actual VL and retry. +* Where SME is implemented it is not possible to GETREGSET the register + state for normal SVE when in streaming mode, nor the streaming mode + register state when in normal mode, regardless of the implementation defined + behaviour of the hardware for sharing data between the two modes. + +* Any SETREGSET of NT_ARM_SVE will exit streaming mode if the target was in + streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode + if the target was not in streaming mode. + * The effect of writing a partial, incomplete payload is unspecified. 8. ELF coredump extensions --------------------------- -* A NT_ARM_SVE note will be added to each coredump for each thread of the - dumped process. The contents will be equivalent to the data that would have - been read if a PTRACE_GETREGSET of NT_ARM_SVE were executed for each thread - when the coredump was generated. - +* NT_ARM_SVE and NT_ARM_SSVE notes will be added to each coredump for + each thread of the dumped process. The contents will be equivalent to the + data that would have been read if a PTRACE_GETREGSET of the corresponding + type were executed for each thread when the coredump was generated. 9. System runtime configuration -------------------------------- From patchwork Wed Jan 26 15:10:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7714C28CF5 for ; Wed, 26 Jan 2022 15:11:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242595AbiAZPL7 (ORCPT ); Wed, 26 Jan 2022 10:11:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242594AbiAZPL7 (ORCPT ); Wed, 26 Jan 2022 10:11:59 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E9A5C06161C for ; Wed, 26 Jan 2022 07:11:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B2FAF6187F for ; Wed, 26 Jan 2022 15:11:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B3498C340EB; Wed, 26 Jan 2022 15:11:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209918; bh=tClwxBxXeStlBQhw570CauUAMkDJNDdPwv6KEDRDvAE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mXs63VOmTJWzfrtJqeJneHUK3pqQRem+ATQSgSNZzcTKin8W/neJzZErpGz/x70lB syGqRNdcN/LBHVylg0HEeKOjFJzeqL5R99cy9zeDOYX6y0CReNYV6s4Rev2V9I6W8e K843Gbe9DbpPxTHy88nr85bowciQ3gMAtOvnEeV4HFZiwEDnT/wzpIeEbLPXAnqBdK cdiJwXfV7wKwhVMYKwm9ntjnVwPJMovx1eKdmQkYLRzhH+JBzmqi9910ltHhKPlp9g BZf+Ko3izSidqa386qigFAuLJnQzaLpKCRgVVOpPtoDUPP4nHse93FWL4p2Ow38KYS DHByuFwMtjDVA== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 06/40] arm64/sme: System register and exception syndrome definitions Date: Wed, 26 Jan 2022 15:10:46 +0000 Message-Id: <20220126151120.3811248-7-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=9085; h=from:subject; bh=tClwxBxXeStlBQhw570CauUAMkDJNDdPwv6KEDRDvAE=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR7Dx0r2L48QqKPssZegcV0F8B7481zG5jsAeZU saYOs0mJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkewAKCRAk1otyXVSH0OrXB/ 4ktXXQKt6GWRp7OjRTW7hjfPfQqqQ5z9Op0DGOOX44eA+M0wDUopZEKU2UfoE5yChCuDNb3D3PdCHN 3THUlKJf/DTR8FYo/5i8ePrtIknX1TB0KHvs+IshHYtCAd4jdyMkHwQSovtueLnvVkAlXwfu+gaZfn S41D7N4Ss8UjwUPXdnaxC+OFZP9yEAX77638MkRhUiPaP+A+f5xRBRrCfg7c/wLb3GsGQC9i5EFQVT JFj/TEv/t5NCi0ECfjt0vzznK5DhrKa6umMzo2pCKLEW9sO57TW25HWjz/IQcN9eVUvdB6sKpDR16S 5v/NNmEfJbYvXMCy0lYB3QQB4h1BdW X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The arm64 Scalable Matrix Extension (SME) adds some new system registers, fields in existing system registers and exception syndromes. This patch adds definitions for these for use in future patches implementing support for this extension. Since SME will be the first user of FEAT_HCX in the kernel also include the definitions for enumerating it and the HCRX system register it adds. Signed-off-by: Mark Brown --- arch/arm64/include/asm/esr.h | 12 +++++- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/include/asm/sysreg.h | 63 ++++++++++++++++++++++++++++++++ arch/arm64/kernel/traps.c | 1 + 4 files changed, 76 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index d52a0b269ee8..43872e0cfd1e 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -37,7 +37,8 @@ #define ESR_ELx_EC_ERET (0x1a) /* EL2 only */ /* Unallocated EC: 0x1B */ #define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */ -/* Unallocated EC: 0x1D - 0x1E */ +#define ESR_ELx_EC_SME (0x1D) +/* Unallocated EC: 0x1E */ #define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */ #define ESR_ELx_EC_IABT_LOW (0x20) #define ESR_ELx_EC_IABT_CUR (0x21) @@ -327,6 +328,15 @@ #define ESR_ELx_CP15_32_ISS_SYS_CNTFRQ (ESR_ELx_CP15_32_ISS_SYS_VAL(0, 0, 14, 0) |\ ESR_ELx_CP15_32_ISS_DIR_READ) +/* + * ISS values for SME traps + */ + +#define ESR_ELx_SME_ISS_SME_DISABLED 0 +#define ESR_ELx_SME_ISS_ILL 1 +#define ESR_ELx_SME_ISS_SM_DISABLED 2 +#define ESR_ELx_SME_ISS_ZA_DISABLED 3 + #ifndef __ASSEMBLY__ #include diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 1767ded83888..13ae232ec4a1 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -279,6 +279,7 @@ #define CPTR_EL2_TCPAC (1U << 31) #define CPTR_EL2_TAM (1 << 30) #define CPTR_EL2_TTA (1 << 20) +#define CPTR_EL2_TSM (1 << 12) #define CPTR_EL2_TFP (1 << CPTR_EL2_TFP_SHIFT) #define CPTR_EL2_TZ (1 << 8) #define CPTR_NVHE_EL2_RES1 0x000032ff /* known RES1 bits in CPTR_EL2 (nVHE) */ diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index e66dd9ebc337..6056c9aaab28 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -173,6 +173,7 @@ #define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0) #define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1) #define SYS_ID_AA64ZFR0_EL1 sys_reg(3, 0, 0, 4, 4) +#define SYS_ID_AA64SMFR0_EL1 sys_reg(3, 0, 0, 4, 5) #define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0) #define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1) @@ -196,6 +197,8 @@ #define SYS_ZCR_EL1 sys_reg(3, 0, 1, 2, 0) #define SYS_TRFCR_EL1 sys_reg(3, 0, 1, 2, 1) +#define SYS_SMPRI_EL1 sys_reg(3, 0, 1, 2, 4) +#define SYS_SMCR_EL1 sys_reg(3, 0, 1, 2, 6) #define SYS_TTBR0_EL1 sys_reg(3, 0, 2, 0, 0) #define SYS_TTBR1_EL1 sys_reg(3, 0, 2, 0, 1) @@ -388,6 +391,8 @@ #define TRBIDR_ALIGN_MASK GENMASK(3, 0) #define TRBIDR_ALIGN_SHIFT 0 +#define SMPRI_EL1_PRIORITY_MASK 0xf + #define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1) #define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2) @@ -443,8 +448,13 @@ #define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0) #define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1) #define SYS_GMID_EL1 sys_reg(3, 1, 0, 0, 4) +#define SYS_SMIDR_EL1 sys_reg(3, 1, 0, 0, 6) #define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7) +#define SYS_SMIDR_EL1_IMPLEMENTER_SHIFT 24 +#define SYS_SMIDR_EL1_SMPS_SHIFT 15 +#define SYS_SMIDR_EL1_AFFINITY_SHIFT 0 + #define SYS_CSSELR_EL1 sys_reg(3, 2, 0, 0, 0) #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) @@ -453,6 +463,10 @@ #define SYS_RNDR_EL0 sys_reg(3, 3, 2, 4, 0) #define SYS_RNDRRS_EL0 sys_reg(3, 3, 2, 4, 1) +#define SYS_SVCR_EL0 sys_reg(3, 3, 4, 2, 2) +#define SYS_SVCR_EL0_ZA_MASK 2 +#define SYS_SVCR_EL0_SM_MASK 1 + #define SYS_PMCR_EL0 sys_reg(3, 3, 9, 12, 0) #define SYS_PMCNTENSET_EL0 sys_reg(3, 3, 9, 12, 1) #define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2) @@ -469,6 +483,7 @@ #define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2) #define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3) +#define SYS_TPIDR2_EL0 sys_reg(3, 3, 13, 0, 5) #define SYS_SCXTNUM_EL0 sys_reg(3, 3, 13, 0, 7) @@ -538,6 +553,9 @@ #define SYS_HFGITR_EL2 sys_reg(3, 4, 1, 1, 6) #define SYS_ZCR_EL2 sys_reg(3, 4, 1, 2, 0) #define SYS_TRFCR_EL2 sys_reg(3, 4, 1, 2, 1) +#define SYS_HCRX_EL2 sys_reg(3, 4, 1, 2, 2) +#define SYS_SMPRIMAP_EL2 sys_reg(3, 4, 1, 2, 5) +#define SYS_SMCR_EL2 sys_reg(3, 4, 1, 2, 6) #define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0) #define SYS_HDFGRTR_EL2 sys_reg(3, 4, 3, 1, 4) #define SYS_HDFGWTR_EL2 sys_reg(3, 4, 3, 1, 5) @@ -597,6 +615,7 @@ #define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0) #define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2) #define SYS_ZCR_EL12 sys_reg(3, 5, 1, 2, 0) +#define SYS_SMCR_EL12 sys_reg(3, 5, 1, 2, 6) #define SYS_TTBR0_EL12 sys_reg(3, 5, 2, 0, 0) #define SYS_TTBR1_EL12 sys_reg(3, 5, 2, 0, 1) #define SYS_TCR_EL12 sys_reg(3, 5, 2, 0, 2) @@ -620,6 +639,7 @@ #define SYS_CNTV_CVAL_EL02 sys_reg(3, 5, 14, 3, 2) /* Common SCTLR_ELx flags. */ +#define SCTLR_ELx_ENTP2 (BIT(60)) #define SCTLR_ELx_DSSBS (BIT(44)) #define SCTLR_ELx_ATA (BIT(43)) @@ -815,6 +835,7 @@ #define ID_AA64PFR0_ELx_32BIT_64BIT 0x2 /* id_aa64pfr1 */ +#define ID_AA64PFR1_SME_SHIFT 24 #define ID_AA64PFR1_MPAMFRAC_SHIFT 16 #define ID_AA64PFR1_RASFRAC_SHIFT 12 #define ID_AA64PFR1_MTE_SHIFT 8 @@ -825,6 +846,7 @@ #define ID_AA64PFR1_SSBS_PSTATE_ONLY 1 #define ID_AA64PFR1_SSBS_PSTATE_INSNS 2 #define ID_AA64PFR1_BT_BTI 0x1 +#define ID_AA64PFR1_SME 1 #define ID_AA64PFR1_MTE_NI 0x0 #define ID_AA64PFR1_MTE_EL0 0x1 @@ -853,6 +875,23 @@ #define ID_AA64ZFR0_AES_PMULL 0x2 #define ID_AA64ZFR0_SVEVER_SVE2 0x1 +/* id_aa64smfr0 */ +#define ID_AA64SMFR0_FA64_SHIFT 63 +#define ID_AA64SMFR0_I16I64_SHIFT 52 +#define ID_AA64SMFR0_F64F64_SHIFT 48 +#define ID_AA64SMFR0_I8I32_SHIFT 36 +#define ID_AA64SMFR0_F16F32_SHIFT 35 +#define ID_AA64SMFR0_B16F32_SHIFT 34 +#define ID_AA64SMFR0_F32F32_SHIFT 32 + +#define ID_AA64SMFR0_FA64 0x1 +#define ID_AA64SMFR0_I16I64 0x4 +#define ID_AA64SMFR0_F64F64 0x1 +#define ID_AA64SMFR0_I8I32 0x4 +#define ID_AA64SMFR0_F16F32 0x1 +#define ID_AA64SMFR0_B16F32 0x1 +#define ID_AA64SMFR0_F32F32 0x1 + /* id_aa64mmfr0 */ #define ID_AA64MMFR0_ECV_SHIFT 60 #define ID_AA64MMFR0_FGT_SHIFT 56 @@ -904,6 +943,7 @@ #endif /* id_aa64mmfr1 */ +#define ID_AA64MMFR1_HCX_SHIFT 40 #define ID_AA64MMFR1_AFP_SHIFT 44 #define ID_AA64MMFR1_ETS_SHIFT 36 #define ID_AA64MMFR1_TWED_SHIFT 32 @@ -1097,9 +1137,24 @@ #define ZCR_ELx_LEN_SIZE 9 #define ZCR_ELx_LEN_MASK 0x1ff +#define SMCR_ELx_FA64_SHIFT 31 +#define SMCR_ELx_FA64_MASK (1 << SMCR_ELx_FA64_SHIFT) + +/* + * The SMCR_ELx_LEN_* definitions intentionally include bits [8:4] which + * are reserved by the SME architecture for future expansion of the LEN + * field, with compatible semantics. + */ +#define SMCR_ELx_LEN_SHIFT 0 +#define SMCR_ELx_LEN_SIZE 9 +#define SMCR_ELx_LEN_MASK 0x1ff + #define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */ #define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */ +#define CPACR_EL1_SMEN_EL1EN (BIT(24)) /* enable EL1 access */ +#define CPACR_EL1_SMEN_EL0EN (BIT(25)) /* enable EL0 access, if EL1EN set */ + #define CPACR_EL1_ZEN_EL1EN (BIT(16)) /* enable EL1 access */ #define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */ @@ -1152,6 +1207,8 @@ #define TRFCR_ELx_ExTRE BIT(1) #define TRFCR_ELx_E0TRE BIT(0) +/* HCRX_EL2 definitions */ +#define HCRX_EL2_SMPME_MASK (1 << 5) /* GIC Hypervisor interface registers */ /* ICH_MISR_EL2 bit definitions */ @@ -1215,6 +1272,12 @@ #define ICH_VTR_TDS_SHIFT 19 #define ICH_VTR_TDS_MASK (1 << ICH_VTR_TDS_SHIFT) +/* HFG[WR]TR_EL2 bit definitions */ +#define HFGxTR_EL2_nTPIDR2_EL0_SHIFT 55 +#define HFGxTR_EL2_nTPIDR2_EL0_MASK BIT_MASK(HFGxTR_EL2_nTPIDR2_EL0_SHIFT) +#define HFGxTR_EL2_nSMPRI_EL1_SHIFT 54 +#define HFGxTR_EL2_nSMPRI_EL1_MASK BIT_MASK(HFGxTR_EL2_nSMPRI_EL1_SHIFT) + #define ARM64_FEATURE_FIELD_BITS 4 /* Create a mask for the feature bits of the specified feature. */ diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index 70fc42470f13..88effb25c399 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -822,6 +822,7 @@ static const char *esr_class_str[] = { [ESR_ELx_EC_SVE] = "SVE", [ESR_ELx_EC_ERET] = "ERET/ERETAA/ERETAB", [ESR_ELx_EC_FPAC] = "FPAC", + [ESR_ELx_EC_SME] = "SME", [ESR_ELx_EC_IMP_DEF] = "EL3 IMP DEF", [ESR_ELx_EC_IABT_LOW] = "IABT (lower EL)", [ESR_ELx_EC_IABT_CUR] = "IABT (current EL)", From patchwork Wed Jan 26 15:10:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537370 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70B8DC3526D for ; Wed, 26 Jan 2022 15:12:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242594AbiAZPMD (ORCPT ); Wed, 26 Jan 2022 10:12:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40744 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242593AbiAZPMD (ORCPT ); Wed, 26 Jan 2022 10:12:03 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E89AC06161C for ; Wed, 26 Jan 2022 07:12:03 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 952C561446 for ; Wed, 26 Jan 2022 15:12:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 97690C340E3; Wed, 26 Jan 2022 15:11:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209922; bh=HMQHyHN36InZca7SOj753D+ZgRaa/ByOBuanPgEEWxc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JRLyk0RrQZ9+rVxuLB43zvOt07MPsgCQBUE7dZPvO4cvnNEDq/XAgma57E3oPcwjI lQZohNIpW99WsTLFmrznfH/ahZHq4tOWNdrhWZpEDp19zYNiNoqJ2NpWUvYst2v818 kEdPDvFjIFv5qsJhUFiil1tqF/dOeiGsI5If6u+fmtKD9w8bAIpsF1LRH+CBeSCNEK Vz4U6HoVqS7t9oRKA3p38IvLJsGWhKg+EovTS6VB5faOUv/e7n3qm0Sqs/V7wRen+L UaiVxOv4pqG517lIMoCD9XltWugEfHPUFdmBZHVuUlupge9iUz5MxxIDy2WTvmXF2K wOuO6qY441fYg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 07/40] arm64/sme: Manually encode SME instructions Date: Wed, 26 Jan 2022 15:10:47 +0000 Message-Id: <20220126151120.3811248-8-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3671; h=from:subject; bh=HMQHyHN36InZca7SOj753D+ZgRaa/ByOBuanPgEEWxc=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR8U4t0P2TtzTYSd2s2Hen3CRw27ngQIyNfDOJC 4/nuJxWJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfAAKCRAk1otyXVSH0GdlB/ 4oKS5E36UbMS2Gr5pcNTJpXfJKqNG/OPJe3K15rM24irlgMZ2rIfRL5Q5aReW/TThySCO815QyK+Zi nQGq9yfvtqmALqMrms2vo6v0yjQxtUthedws3zD4Pi36fOfozGq5G3w1QPYW8poy+axo3FRZIejym2 sLzC1KEHE5TUKkSqGrkzRgjNGigcA4Oa4RZS7DYaFqwYhjNYBPIucUp+TpiqPQh/YuL0U48y88+HCx SScW/qZdtqVZbtj1PDkvXJqkByTLRMb6AjIB0eyHB09JDH6Y42hGDbtazCI1Ic9Wn6AtF0LE+EcM61 KrUWravhZRowxo95x84tVdWXqth6B5 X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org As with SVE rather than impose ambitious toolchain requirements for SME we manually encode the few instructions which we require in order to perform the work the kernel needs to do. The instructions used to save and restore context are provided as assembler macros while those for entering and leaving streaming mode are done in asm volatile blocks since they are expected to be used from C. We could do the SMSTART and SMSTOP operations with read/modify/write cycles on SVCR but using the aliases provided for individual field accesses should be slightly faster. These instructions are aliases for MSR but since our minimum toolchain requirements are old enough to mean that we can't use the sX_X_cX_cX_X form and they always use xzr rather than taking a value like write_sysreg_s() wants we just use .inst. Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 25 +++++++++++++ arch/arm64/include/asm/fpsimdmacros.h | 53 +++++++++++++++++++++++++++ 2 files changed, 78 insertions(+) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index cb24385e3632..c90f7f99a768 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -249,6 +249,31 @@ static inline void sve_setup(void) { } #endif /* ! CONFIG_ARM64_SVE */ +#ifdef CONFIG_ARM64_SME + +static inline void sme_smstart_sm(void) +{ + asm volatile(".inst 0xd503437f"); +} + +static inline void sme_smstop_sm(void) +{ + asm volatile(".inst 0xd503427f"); +} + +static inline void sme_smstop(void) +{ + asm volatile(".inst 0xd503467f"); +} + +#else + +static inline void sme_smstart_sm(void) { } +static inline void sme_smstop_sm(void) { } +static inline void sme_smstop(void) { } + +#endif /* ! CONFIG_ARM64_SME */ + /* For use by EFI runtime services calls only */ extern void __efi_fpsimd_begin(void); extern void __efi_fpsimd_end(void); diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h index 2509d7dde55a..11c426ddd62c 100644 --- a/arch/arm64/include/asm/fpsimdmacros.h +++ b/arch/arm64/include/asm/fpsimdmacros.h @@ -93,6 +93,12 @@ .endif .endm +.macro _sme_check_wv v + .if (\v) < 12 || (\v) > 15 + .error "Bad vector select register \v." + .endif +.endm + /* SVE instruction encodings for non-SVE-capable assemblers */ /* (pre binutils 2.28, all kernel capable clang versions support SVE) */ @@ -174,6 +180,53 @@ | (\np) .endm +/* SME instruction encodings for non-SME-capable assemblers */ + +/* RDSVL X\nx, #\imm */ +.macro _sme_rdsvl nx, imm + _check_general_reg \nx + _check_num (\imm), -0x20, 0x1f + .inst 0x04bf5800 \ + | (\nx) \ + | (((\imm) & 0x3f) << 5) +.endm + +/* + * STR (vector from ZA array): + * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _sme_str_zav nw, nxbase, offset=0 + _sme_check_wv \nw + _check_general_reg \nxbase + _check_num (\offset), -0x100, 0xff + .inst 0xe1200000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + +/* + * LDR (vector to ZA array): + * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _sme_ldr_zav nw, nxbase, offset=0 + _sme_check_wv \nw + _check_general_reg \nxbase + _check_num (\offset), -0x100, 0xff + .inst 0xe1000000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + +/* + * Zero the entire ZA array + * ZERO ZA + */ +.macro zero_za + .inst 0xc00800ff +.endm + .macro __for from:req, to:req .if (\from) == (\to) _for__body %\from From patchwork Wed Jan 26 15:10:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2CFDC28CF5 for ; Wed, 26 Jan 2022 15:12:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235782AbiAZPMI (ORCPT ); Wed, 26 Jan 2022 10:12:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMH (ORCPT ); Wed, 26 Jan 2022 10:12:07 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42867C06161C for ; Wed, 26 Jan 2022 07:12:07 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CDB9B61463 for ; Wed, 26 Jan 2022 15:12:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81720C340E9; Wed, 26 Jan 2022 15:12:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209926; bh=AWzaL5c8DPTrwsTCJ36oC2PZV7XOSQ8peibKL+aPCWM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=f/d49URjsXrmI9y0aWqPlr/dCpdbf34f7EHxaLJIfN49wXpxE5r2XJgsebfVU2D8j tMa1kyVJHwYJeO5X1WxzdiTjVpCYdpfhKxGk3rJbyyOe8EFXxaCfOzdBdV9xGUFLQk aL0LStki1yDje9ZnW54N0V6ypjR9gIwL0YrIQTYvxhv/SKDskWN8fcIqF3f5SBceFB f1/ESch+U2oFT+RT6Bqvdo3Qnly4OEM55BvB6GOdR83CWie9+Dird7g9U9pz26HSN9 0Wgk+oqnQLRPiDYoIxj8Qw26JGr07Ng0+AThUn9JQmW2f6YYCD46ODlNIZDPs2EjHG A1dDXtkSrzTuQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 08/40] arm64/sme: Early CPU setup for SME Date: Wed, 26 Jan 2022 15:10:48 +0000 Message-Id: <20220126151120.3811248-9-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3314; h=from:subject; bh=AWzaL5c8DPTrwsTCJ36oC2PZV7XOSQ8peibKL+aPCWM=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR8qDuewMaQ7evrpewaZgROdkgtfGtwQQOcqNY9 4t+rWRqJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfAAKCRAk1otyXVSH0JIiB/ oDomIda2grrRMkiGxD9ZuqDteLUjmV5eJIdpo6otJafL6Dgfzw/Cf3orWhcjB/3QZzZoKeCuZYe8Lg 6SsGf1FXyF7R0yHzpg+YqjCpSI7I/qWRi1K+kIC/UMC6OSKhqBHx5Uh6BIvvggHsCXqZl2Uern4Ux2 xX+BufJjq0GSMpGEi+KuIKKJhUZ4fjYl1oir6b2cvRU93ecbWVXSlJaI+XkkcyOo5oAAlWcW7lXza5 kumejDyQ/SMApW+0GYENe2Mp/cCiLfkxufaGHZJ5qVleuWbdeOQ5Tulj9432RbYn0vOrFeBVlgUViC KyOI5FDUsbS0wCH5nrsNotChTES6hE X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org SME requires similar setup to that for SVE: disable traps to EL2 and make sure that the maximum vector length is available to EL1, for SME we have two traps - one for SME itself and one for TPIDR2. In addition since we currently make no active use of priority control for SCMUs we map all SME priorities lower ELs may configure to 0, the architecture specified minimum priority, to ensure that nothing we manage is able to configure itself to consume excessive resources. This will need to be revisited should there be a need to manage SME priorities at runtime. Signed-off-by: Mark Brown --- arch/arm64/include/asm/el2_setup.h | 64 ++++++++++++++++++++++++++++-- 1 file changed, 60 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 3198acb2aad8..31f1a69c9dd2 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -143,6 +143,50 @@ .Lskip_sve_\@: .endm +/* SME register access and priority mapping */ +.macro __init_el2_nvhe_sme + mrs x1, id_aa64pfr1_el1 + ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4 + cbz x1, .Lskip_sme_\@ + + bic x0, x0, #CPTR_EL2_TSM // Also disable SME traps + msr cptr_el2, x0 // Disable copro. traps to EL2 + isb + + mrs x1, sctlr_el2 + orr x1, x1, #SCTLR_ELx_ENTP2 // Disable TPIDR2 traps + msr sctlr_el2, x1 + isb + + mov x1, #0 // SMCR controls + + mrs_s x2, SYS_ID_AA64SMFR0_EL1 + ubfx x2, x2, #ID_AA64SMFR0_FA64_SHIFT, #1 // Full FP in SM? + cbz x2, .Lskip_sme_fa64_\@ + + orr x1, x1, SMCR_ELx_FA64_MASK +.Lskip_sme_fa64_\@: + + orr x1, x1, #SMCR_ELx_LEN_MASK // Enable full SME vector + msr_s SYS_SMCR_EL2, x1 // length for EL1. + + mrs_s x1, SYS_SMIDR_EL1 // Priority mapping supported? + ubfx x1, x1, #SYS_SMIDR_EL1_SMPS_SHIFT, #1 + cbz x1, .Lskip_sme_\@ + + msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal + + mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present? + ubfx x1, x1, #ID_AA64MMFR1_HCX_SHIFT, #4 + cbz x1, .Lskip_sme_\@ + + mrs_s x1, SYS_HCRX_EL2 + orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping + msr_s SYS_HCRX_EL2, x1 + +.Lskip_sme_\@: +.endm + /* Disable any fine grained traps */ .macro __init_el2_fgt mrs x1, id_aa64mmfr0_el1 @@ -153,15 +197,26 @@ mrs x1, id_aa64dfr0_el1 ubfx x1, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 cmp x1, #3 - b.lt .Lset_fgt_\@ + b.lt .Lset_debug_fgt_\@ /* Disable PMSNEVFR_EL1 read and write traps */ orr x0, x0, #(1 << 62) -.Lset_fgt_\@: +.Lset_debug_fgt_\@: msr_s SYS_HDFGRTR_EL2, x0 msr_s SYS_HDFGWTR_EL2, x0 - msr_s SYS_HFGRTR_EL2, xzr - msr_s SYS_HFGWTR_EL2, xzr + + mov x0, xzr + mrs x1, id_aa64pfr1_el1 + ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4 + cbz x1, .Lset_fgt_\@ + + /* Disable nVHE traps of TPIDR2 and SMPRI */ + orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK + orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK + +.Lset_fgt_\@: + msr_s SYS_HFGRTR_EL2, x0 + msr_s SYS_HFGWTR_EL2, x0 msr_s SYS_HFGITR_EL2, xzr mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU @@ -196,6 +251,7 @@ __init_el2_nvhe_idregs __init_el2_nvhe_cptr __init_el2_nvhe_sve + __init_el2_nvhe_sme __init_el2_fgt __init_el2_nvhe_prepare_eret .endm From patchwork Wed Jan 26 15:10:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66253C2BA4C for ; Wed, 26 Jan 2022 15:12:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235742AbiAZPMM (ORCPT ); Wed, 26 Jan 2022 10:12:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPML (ORCPT ); Wed, 26 Jan 2022 10:12:11 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 60D6AC06161C for ; Wed, 26 Jan 2022 07:12:11 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E4DF661376 for ; Wed, 26 Jan 2022 15:12:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B1BC6C340EC; Wed, 26 Jan 2022 15:12:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209930; bh=os1IFQ/jEic4JHq3OV+Wd1dOJrRDJlwZrLQBQNu+rzo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MCA3xbfi4zUse3zPJr8zirFT+QvkzLZrUWYK5vpkd/V9hLVUr78l7ap3B4XimUhVR ngDFL66NpGeNEc9MOZIVsw+j3s15BoT12hv/v+A3rkSbEFLEow3esxt5ZVgBnE/d20 kJYx9GMlEZsTMQ6J/outJkQTN/+QoUT+/fTjQiHWBJtV0Uyp7VybHPVR3V/IXb1yHO xApIIxIrtuBceI/iZrygAqHzDfZtYvVUcxgqaX44zlC7M+dQwUZtsJ2S/JxDz3mkD2 GRcrak9QIgXTyoQRwWmtEQ/qZnPnckOiRUx7jPLpmHGJTvvC1DgpFfqikfjNAFQK5y HiFTQVM3cLxtQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 09/40] arm64/sme: Basic enumeration support Date: Wed, 26 Jan 2022 15:10:49 +0000 Message-Id: <20220126151120.3811248-10-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=13377; h=from:subject; bh=os1IFQ/jEic4JHq3OV+Wd1dOJrRDJlwZrLQBQNu+rzo=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR9O6zLSLUpErjcYA/5RbAU5Yq0BKz8PARdoJj8 GMZ0xyuJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfQAKCRAk1otyXVSH0AdFB/ 96UDorZMlUhPr8xKIyms4mLDNMz4+CoI+g937myeZdryqy/pjWNPpDVVOpm1kx0jrgrW/vxgMUqleM CP9Xc6r13/F7Tuvs3IZ04mzxtnolBmLSsb3ygT97Ka7LTlfOHAwMMuI1y/8BdBxXlgYXlaMZSooHsK 9PH4vDRyYn9XWB2dS6MyWaLr6GqV/3LDyGWHsqRbVQ2XIyMvueV11w0GvPKYOMV1dPt9Nkr31s/b/A Abhyhj4o/zzOIGbIM9AtCj/WOVH+BShXzd8oXO5RcFQOyG8hLfsMDBmxoRoRI4xFX45/OnRFNPFvG0 TWMLF9ftUTjCfAJ6IH3mPkteaw9JFn X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org This patch introduces basic cpufeature support for discovering the presence of the Scalable Matrix Extension. Signed-off-by: Mark Brown --- Documentation/arm64/elf_hwcaps.rst | 33 +++++++++++++++++ arch/arm64/include/asm/cpu.h | 1 + arch/arm64/include/asm/cpufeature.h | 12 ++++++ arch/arm64/include/asm/fpsimd.h | 2 + arch/arm64/include/asm/hwcap.h | 8 ++++ arch/arm64/include/uapi/asm/hwcap.h | 8 ++++ arch/arm64/kernel/cpufeature.c | 57 +++++++++++++++++++++++++++++ arch/arm64/kernel/cpuinfo.c | 9 +++++ arch/arm64/kernel/fpsimd.c | 30 +++++++++++++++ arch/arm64/tools/cpucaps | 2 + 10 files changed, 162 insertions(+) diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst index b72ff17d600a..5626cf208000 100644 --- a/Documentation/arm64/elf_hwcaps.rst +++ b/Documentation/arm64/elf_hwcaps.rst @@ -259,6 +259,39 @@ HWCAP2_RPRES Functionality implied by ID_AA64ISAR2_EL1.RPRES == 0b0001. +HWCAP2_SME + + Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001, as described + by Documentation/arm64/sme.rst. + +HWCAP2_SME_I16I64 + + Functionality implied by ID_AA64SMFR0_EL1.I16I64 == 0b1111. + +HWCAP2_SME_F64F64 + + Functionality implied by ID_AA64SMFR0_EL1.F64F64 == 0b1. + +HWCAP2_SME_I8I32 + + Functionality implied by ID_AA64SMFR0_EL1.I8I32 == 0b1111. + +HWCAP2_SME_F16F32 + + Functionality implied by ID_AA64SMFR0_EL1.F16F32 == 0b1. + +HWCAP2_SME_B16F32 + + Functionality implied by ID_AA64SMFR0_EL1.B16F32 == 0b1. + +HWCAP2_SME_F32F32 + + Functionality implied by ID_AA64SMFR0_EL1.F32F32 == 0b1. + +HWCAP2_SME_FA64 + + Functionality implied by ID_AA64SMFR0_EL1.FA64 == 0b1. + 4. Unused AT_HWCAP bits ----------------------- diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h index a58e366f0b07..d08062bcb9c1 100644 --- a/arch/arm64/include/asm/cpu.h +++ b/arch/arm64/include/asm/cpu.h @@ -58,6 +58,7 @@ struct cpuinfo_arm64 { u64 reg_id_aa64pfr0; u64 reg_id_aa64pfr1; u64 reg_id_aa64zfr0; + u64 reg_id_aa64smfr0; struct cpuinfo_32bit aarch32; diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 2728abd9cae4..f93b1442143f 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -728,6 +728,18 @@ static __always_inline bool system_supports_sve(void) cpus_have_const_cap(ARM64_SVE); } +static __always_inline bool system_supports_sme(void) +{ + return IS_ENABLED(CONFIG_ARM64_SME) && + cpus_have_const_cap(ARM64_SME); +} + +static __always_inline bool system_supports_fa64(void) +{ + return IS_ENABLED(CONFIG_ARM64_SME) && + cpus_have_const_cap(ARM64_SME_FA64); +} + static __always_inline bool system_supports_cnp(void) { return IS_ENABLED(CONFIG_ARM64_CNP) && diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index c90f7f99a768..6b7eb6f2cecd 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -74,6 +74,8 @@ extern void sve_set_vq(unsigned long vq_minus_1); struct arm64_cpu_capabilities; extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused); +extern void sme_kernel_enable(const struct arm64_cpu_capabilities *__unused); +extern void fa64_kernel_enable(const struct arm64_cpu_capabilities *__unused); extern u64 read_zcr_features(void); diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h index f68fbb207473..76d9999527c5 100644 --- a/arch/arm64/include/asm/hwcap.h +++ b/arch/arm64/include/asm/hwcap.h @@ -108,6 +108,14 @@ #define KERNEL_HWCAP_ECV __khwcap2_feature(ECV) #define KERNEL_HWCAP_AFP __khwcap2_feature(AFP) #define KERNEL_HWCAP_RPRES __khwcap2_feature(RPRES) +#define KERNEL_HWCAP_SME __khwcap2_feature(SME) +#define KERNEL_HWCAP_SME_I16I64 __khwcap2_feature(SME_I16I64) +#define KERNEL_HWCAP_SME_F64F64 __khwcap2_feature(SME_F64F64) +#define KERNEL_HWCAP_SME_I8I32 __khwcap2_feature(SME_I8I32) +#define KERNEL_HWCAP_SME_F16F32 __khwcap2_feature(SME_F16F32) +#define KERNEL_HWCAP_SME_B16F32 __khwcap2_feature(SME_B16F32) +#define KERNEL_HWCAP_SME_F32F32 __khwcap2_feature(SME_F32F32) +#define KERNEL_HWCAP_SME_FA64 __khwcap2_feature(SME_FA64) /* * This yields a mask that user programs can use to figure out what diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h index f03731847d9d..60de5626f8fb 100644 --- a/arch/arm64/include/uapi/asm/hwcap.h +++ b/arch/arm64/include/uapi/asm/hwcap.h @@ -78,5 +78,13 @@ #define HWCAP2_ECV (1 << 19) #define HWCAP2_AFP (1 << 20) #define HWCAP2_RPRES (1 << 21) +#define HWCAP2_SME (1 << 22) +#define HWCAP2_SME_I16I64 (1 << 23) +#define HWCAP2_SME_F64F64 (1 << 24) +#define HWCAP2_SME_I8I32 (1 << 25) +#define HWCAP2_SME_F16F32 (1 << 26) +#define HWCAP2_SME_B16F32 (1 << 27) +#define HWCAP2_SME_F32F32 (1 << 28) +#define HWCAP2_SME_FA64 (1 << 29) #endif /* _UAPI__ASM_HWCAP_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index d9f09e40aaf6..3cd7fe53cffa 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -251,6 +251,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { }; static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = { + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SME_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE), @@ -283,6 +284,24 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = { ARM64_FTR_END, }; +static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = { + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_FA64_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I16I64_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F64F64_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I8I32_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F16F32_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_B16F32_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F32F32_SHIFT, 1, 0), + ARM64_FTR_END, +}; + static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_ECV_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_FGT_SHIFT, 4, 0), @@ -634,6 +653,7 @@ static const struct __ftr_reg_entry { ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1, &id_aa64pfr1_override), ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0), + ARM64_FTR_REG(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0), /* Op1 = 0, CRn = 0, CRm = 5 */ ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0), @@ -947,6 +967,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info) init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0); init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1); init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0); + init_cpu_ftr_reg(SYS_ID_AA64SMFR0_EL1, info->reg_id_aa64smfr0); if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) init_32bit_cpu_features(&info->aarch32); @@ -2400,6 +2421,32 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .min_field_value = 1, }, +#ifdef CONFIG_ARM64_SME + { + .desc = "Scalable Matrix Extension", + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .capability = ARM64_SME, + .sys_reg = SYS_ID_AA64PFR1_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64PFR1_SME_SHIFT, + .field_width = 4, + .min_field_value = ID_AA64PFR1_SME, + .matches = has_cpuid_feature, + .cpu_enable = sme_kernel_enable, + }, + { + .desc = "FA64", + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .capability = ARM64_SME_FA64, + .sys_reg = SYS_ID_AA64SMFR0_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64SMFR0_FA64_SHIFT, + .field_width = 1, + .min_field_value = ID_AA64SMFR0_FA64, + .matches = has_cpuid_feature, + .cpu_enable = fa64_kernel_enable, + }, +#endif /* CONFIG_ARM64_SME */ {}, }; @@ -2524,6 +2571,16 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV), HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP), HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), +#ifdef CONFIG_ARM64_SME + HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SME, CAP_HWCAP, KERNEL_HWCAP_SME), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_FA64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_FA64, CAP_HWCAP, KERNEL_HWCAP_SME_FA64), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I16I64_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I16I64, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F64F64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F64F64, CAP_HWCAP, KERNEL_HWCAP_SME_F64F64), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I8I32_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I8I32, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F16F32, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_B16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_B16F32, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32), + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F32F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F32F32, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32), +#endif /* CONFIG_ARM64_SME */ {}, }; diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 591c18a889a5..33ec182e872e 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -97,6 +97,14 @@ static const char *const hwcap_str[] = { [KERNEL_HWCAP_ECV] = "ecv", [KERNEL_HWCAP_AFP] = "afp", [KERNEL_HWCAP_RPRES] = "rpres", + [KERNEL_HWCAP_SME] = "sme", + [KERNEL_HWCAP_SME_I16I64] = "smei16i64", + [KERNEL_HWCAP_SME_F64F64] = "smef64f64", + [KERNEL_HWCAP_SME_I8I32] = "smei8i32", + [KERNEL_HWCAP_SME_F16F32] = "smef16f32", + [KERNEL_HWCAP_SME_B16F32] = "smeb16f32", + [KERNEL_HWCAP_SME_F32F32] = "smef32f32", + [KERNEL_HWCAP_SME_FA64] = "smefa64", }; #ifdef CONFIG_COMPAT @@ -400,6 +408,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1); info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1); info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1); + info->reg_id_aa64smfr0 = read_cpuid(ID_AA64SMFR0_EL1); if (id_aa64pfr1_mte(info->reg_id_aa64pfr1)) info->reg_gmid = read_cpuid(GMID_EL1); diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 5280e098cfb5..576490be3c2b 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -987,6 +987,32 @@ void fpsimd_release_task(struct task_struct *dead_task) #endif /* CONFIG_ARM64_SVE */ +#ifdef CONFIG_ARM64_SME + +void sme_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p) +{ + /* Set priority for all PEs to architecturally defined minimum */ + write_sysreg_s(read_sysreg_s(SYS_SMPRI_EL1) & ~SMPRI_EL1_PRIORITY_MASK, + SYS_SMPRI_EL1); + + /* Allow SME in kernel */ + write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_SMEN_EL1EN, CPACR_EL1); + isb(); +} + +/* + * This must be called after sme_kernel_enable(), we rely on the + * feature table being sorted to ensure this. + */ +void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p) +{ + /* Allow use of FA64 */ + write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_FA64_MASK, + SYS_SMCR_EL1); +} + +#endif /* CONFIG_ARM64_SVE */ + /* * Trapped SVE access * @@ -1532,6 +1558,10 @@ static int __init fpsimd_init(void) if (!cpu_have_named_feature(ASIMD)) pr_notice("Advanced SIMD is not implemented\n"); + + if (cpu_have_named_feature(SME) && !cpu_have_named_feature(SVE)) + pr_notice("SME is implemented but not SVE\n"); + return sve_sysctl_init(); } core_initcall(fpsimd_init); diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 870c39537dd0..58dfc8547c64 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -41,6 +41,8 @@ KVM_PROTECTED_MODE MISMATCHED_CACHE_TYPE MTE MTE_ASYMM +SME +SME_FA64 SPECTRE_V2 SPECTRE_V3A SPECTRE_V4 From patchwork Wed Jan 26 15:10:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537024 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 593DAC28CF5 for ; Wed, 26 Jan 2022 15:12:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235802AbiAZPMT (ORCPT ); Wed, 26 Jan 2022 10:12:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMS (ORCPT ); Wed, 26 Jan 2022 10:12:18 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC17DC06161C for ; Wed, 26 Jan 2022 07:12:17 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 69B78B81C0F for ; Wed, 26 Jan 2022 15:12:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F25E4C340E8; Wed, 26 Jan 2022 15:12:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209935; bh=iVdj3mxas6xH+1iRC6maiXHTY83c1rk4ADYLHk9Ya1g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SKGIB0+B70S87ghwsbcZw6M4QGuMgIsSPkwb4mKv2U/2xE6hntMckgq7gwdnff/lq rMjfwsHNQm1oAiJS9tCSZzhBL2yvefqbxHhFcn4EBrhjW82IC6OBeWdicAtm1YTXE0 E11wv69kzYtyEPtfmLvegr3J1eSRE/A6v/qgvr3F14d3VAWsB5nsi/vupSGIi/2/IO 3CFPzrhe1k+BTtE912z6eJH7+vbZiJCpYH6NSoVR+sDCwgnBaPHVq1fFXwrXoyxfu6 C9/1UhrC7/5VU7kL8u4qHx160Ec+El7nnbSHAEyNem6I+NUwqwHpySisGPdzoiJrZS XyPJ03FK99N4w== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 10/40] arm64/sme: Identify supported SME vector lengths at boot Date: Wed, 26 Jan 2022 15:10:50 +0000 Message-Id: <20220126151120.3811248-11-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=13528; h=from:subject; bh=iVdj3mxas6xH+1iRC6maiXHTY83c1rk4ADYLHk9Ya1g=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR+vDf9SkR+8G7PVz4qDYk9qdqPXmwn/LtoUmL+ e06ZyH2JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfgAKCRAk1otyXVSH0OnDB/ 9TyV1zpJxDOH5P8q54E7cliw0L/FquXBRdyREzz+yO1jG0+cOu4EE0THWU3KwrSwkBFPnlGk5YV9BD cbm4HPXq+F4CJIifauGsc3/vd2jjMPj2CI5SCNzbrOJH9ejT1KJqU/xxvZeskpLG8ec5oAd92e9ffs nQhpX/8QdaTz4SIRj0KBziETe4EwJUqdFyxKf9OdO1AWxLZIW7K8gMtKJQC1b8XqZqq0H1dVyH1D+P diJnf9xr4k8pMcMDAmGhyEheILfgIZmb7/Ih0dH4fz9W1oDkZC2cnAGGq4mtRue+qNJASAvV35Sipp 8Z0yppNphCu7zH//jLCCEfNkT0RQZl X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The vector lengths used for SME are controlled through a similar set of registers to those for SVE and enumerated using a similar algorithm with some slight differences due to the fact that unlike SVE there are no restrictions on which combinations of vector lengths can be supported nor any mandatory vector lengths which must be implemented. Add a new vector type and implement support for enumerating it. One slightly awkward feature is that we need to read the current vector length using a different instruction (or enter streaming mode which would have the same issue and be higher cost). Rather than add an ops structure we add special cases directly in the otherwise generic vec_probe_vqs() function, this is a bit inelegant but it's the only place where this is an issue. Signed-off-by: Mark Brown --- arch/arm64/include/asm/cpu.h | 3 + arch/arm64/include/asm/cpufeature.h | 7 ++ arch/arm64/include/asm/fpsimd.h | 26 ++++++ arch/arm64/include/asm/processor.h | 1 + arch/arm64/kernel/cpufeature.c | 49 +++++++++++ arch/arm64/kernel/cpuinfo.c | 4 + arch/arm64/kernel/entry-fpsimd.S | 9 ++ arch/arm64/kernel/fpsimd.c | 123 +++++++++++++++++++++++++++- 8 files changed, 220 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h index d08062bcb9c1..550e1fc4ae6c 100644 --- a/arch/arm64/include/asm/cpu.h +++ b/arch/arm64/include/asm/cpu.h @@ -64,6 +64,9 @@ struct cpuinfo_arm64 { /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */ u64 reg_zcr; + + /* pseudo-SMCR for recording maximum ZCR_EL1 LEN value: */ + u64 reg_smcr; }; DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data); diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index f93b1442143f..9d36035acce3 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -620,6 +620,13 @@ static inline bool id_aa64pfr0_sve(u64 pfr0) return val > 0; } +static inline bool id_aa64pfr1_sme(u64 pfr1) +{ + u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_SME_SHIFT); + + return val > 0; +} + static inline bool id_aa64pfr1_mte(u64 pfr1) { u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_MTE_SHIFT); diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index 6b7eb6f2cecd..8cb99b6fe565 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -78,6 +78,7 @@ extern void sme_kernel_enable(const struct arm64_cpu_capabilities *__unused); extern void fa64_kernel_enable(const struct arm64_cpu_capabilities *__unused); extern u64 read_zcr_features(void); +extern u64 read_smcr_features(void); /* * Helpers to translate bit indices in sve_vq_map to VQ values (and @@ -172,6 +173,12 @@ static inline void write_vl(enum vec_type type, u64 val) tmp = read_sysreg_s(SYS_ZCR_EL1) & ~ZCR_ELx_LEN_MASK; write_sysreg_s(tmp | val, SYS_ZCR_EL1); break; +#endif +#ifdef CONFIG_ARM64_SME + case ARM64_VEC_SME: + tmp = read_sysreg_s(SYS_SMCR_EL1) & ~SMCR_ELx_LEN_MASK; + write_sysreg_s(tmp | val, SYS_SMCR_EL1); + break; #endif default: WARN_ON_ONCE(1); @@ -268,12 +275,31 @@ static inline void sme_smstop(void) asm volatile(".inst 0xd503467f"); } +extern void __init sme_setup(void); + +static inline int sme_max_vl(void) +{ + return vec_max_vl(ARM64_VEC_SME); +} + +static inline int sme_max_virtualisable_vl(void) +{ + return vec_max_virtualisable_vl(ARM64_VEC_SME); +} + +extern unsigned int sme_get_vl(void); + #else static inline void sme_smstart_sm(void) { } static inline void sme_smstop_sm(void) { } static inline void sme_smstop(void) { } +static inline void sme_setup(void) { } +static inline unsigned int sme_get_vl(void) { return 0; } +static inline int sme_max_vl(void) { return 0; } +static inline int sme_max_virtualisable_vl(void) { return 0; } + #endif /* ! CONFIG_ARM64_SME */ /* For use by EFI runtime services calls only */ diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index 6f41b65f9962..df68f0fb0ded 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -117,6 +117,7 @@ struct debug_info { enum vec_type { ARM64_VEC_SVE = 0, + ARM64_VEC_SME, ARM64_VEC_MAX, }; diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 3cd7fe53cffa..1995ff72a8c6 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -570,6 +570,12 @@ static const struct arm64_ftr_bits ftr_zcr[] = { ARM64_FTR_END, }; +static const struct arm64_ftr_bits ftr_smcr[] = { + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, + SMCR_ELx_LEN_SHIFT, SMCR_ELx_LEN_SIZE, 0), /* LEN */ + ARM64_FTR_END, +}; + /* * Common ftr bits for a 32bit register with all hidden, strict * attributes, with 4bit feature fields and a default safe value of @@ -673,6 +679,7 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 1, CRm = 2 */ ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr), + ARM64_FTR_REG(SYS_SMCR_EL1, ftr_smcr), /* Op1 = 1, CRn = 0, CRm = 0 */ ARM64_FTR_REG(SYS_GMID_EL1, ftr_gmid), @@ -977,6 +984,14 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info) vec_init_vq_map(ARM64_VEC_SVE); } + if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) { + init_cpu_ftr_reg(SYS_SMCR_EL1, info->reg_smcr); + if (IS_ENABLED(CONFIG_ARM64_SME)) { + sme_kernel_enable(NULL); + vec_init_vq_map(ARM64_VEC_SME); + } + } + if (id_aa64pfr1_mte(info->reg_id_aa64pfr1)) init_cpu_ftr_reg(SYS_GMID_EL1, info->reg_gmid); @@ -1203,6 +1218,9 @@ void update_cpu_features(int cpu, taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu, info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0); + taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu, + info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0); + if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu, info->reg_zcr, boot->reg_zcr); @@ -1213,6 +1231,16 @@ void update_cpu_features(int cpu, vec_update_vq_map(ARM64_VEC_SVE); } + if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) { + taint |= check_update_ftr_reg(SYS_SMCR_EL1, cpu, + info->reg_smcr, boot->reg_smcr); + + /* Probe vector lengths, unless we already gave up on SME */ + if (id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1)) && + !system_capabilities_finalized()) + vec_update_vq_map(ARM64_VEC_SME); + } + /* * The kernel uses the LDGM/STGM instructions and the number of tags * they read/write depends on the GMID_EL1.BS field. Check that the @@ -2878,6 +2906,23 @@ static void verify_sve_features(void) /* Add checks on other ZCR bits here if necessary */ } +static void verify_sme_features(void) +{ + u64 safe_smcr = read_sanitised_ftr_reg(SYS_SMCR_EL1); + u64 smcr = read_smcr_features(); + + unsigned int safe_len = safe_smcr & SMCR_ELx_LEN_MASK; + unsigned int len = smcr & SMCR_ELx_LEN_MASK; + + if (len < safe_len || vec_verify_vq_map(ARM64_VEC_SME)) { + pr_crit("CPU%d: SME: vector length support mismatch\n", + smp_processor_id()); + cpu_die_early(); + } + + /* Add checks on other SMCR bits here if necessary */ +} + static void verify_hyp_capabilities(void) { u64 safe_mmfr1, mmfr0, mmfr1; @@ -2930,6 +2975,9 @@ static void verify_local_cpu_capabilities(void) if (system_supports_sve()) verify_sve_features(); + if (system_supports_sme()) + verify_sme_features(); + if (is_hyp_mode_available()) verify_hyp_capabilities(); } @@ -3047,6 +3095,7 @@ void __init setup_cpu_features(void) pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n"); sve_setup(); + sme_setup(); minsigstksz_setup(); /* Advertise that we have computed the system capabilities */ diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 33ec182e872e..72dc155719b0 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -420,6 +420,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) id_aa64pfr0_sve(info->reg_id_aa64pfr0)) info->reg_zcr = read_zcr_features(); + if (IS_ENABLED(CONFIG_ARM64_SME) && + id_aa64pfr1_sme(info->reg_id_aa64pfr1)) + info->reg_smcr = read_smcr_features(); + cpuinfo_detect_icache_policy(info); } diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S index dc242e269f9a..deee5f01462e 100644 --- a/arch/arm64/kernel/entry-fpsimd.S +++ b/arch/arm64/kernel/entry-fpsimd.S @@ -86,3 +86,12 @@ SYM_FUNC_START(sve_flush_live) SYM_FUNC_END(sve_flush_live) #endif /* CONFIG_ARM64_SVE */ + +#ifdef CONFIG_ARM64_SME + +SYM_FUNC_START(sme_get_vl) + _sme_rdsvl 0, 1 + ret +SYM_FUNC_END(sme_get_vl) + +#endif /* CONFIG_ARM64_SME */ diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 576490be3c2b..3fb2167f8af7 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -136,6 +136,12 @@ __ro_after_init struct vl_info vl_info[ARM64_VEC_MAX] = { .max_virtualisable_vl = SVE_VL_MIN, }, #endif +#ifdef CONFIG_ARM64_SME + [ARM64_VEC_SME] = { + .type = ARM64_VEC_SME, + .name = "SME", + }, +#endif }; static unsigned int vec_vl_inherit_flag(enum vec_type type) @@ -186,6 +192,20 @@ extern void __percpu *efi_sve_state; #endif /* ! CONFIG_ARM64_SVE */ +#ifdef CONFIG_ARM64_SME + +static int get_sme_default_vl(void) +{ + return get_default_vl(ARM64_VEC_SME); +} + +static void set_sme_default_vl(int val) +{ + set_default_vl(ARM64_VEC_SME, val); +} + +#endif + DEFINE_PER_CPU(bool, fpsimd_context_busy); EXPORT_PER_CPU_SYMBOL(fpsimd_context_busy); @@ -403,6 +423,8 @@ static unsigned int find_supported_vector_length(enum vec_type type, if (vl > max_vl) vl = max_vl; + if (vl < info->min_vl) + vl = info->min_vl; bit = find_next_bit(info->vq_map, SVE_VQ_MAX, __vq_to_bit(sve_vq_from_vl(vl))); @@ -764,7 +786,23 @@ static void vec_probe_vqs(struct vl_info *info, for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) { write_vl(info->type, vq - 1); /* self-syncing */ - vl = sve_get_vl(); + + switch (info->type) { + case ARM64_VEC_SVE: + vl = sve_get_vl(); + break; + case ARM64_VEC_SME: + vl = sme_get_vl(); + break; + default: + vl = 0; + break; + } + + /* Minimum VL identified? */ + if (sve_vq_from_vl(vl) > vq) + break; + vq = sve_vq_from_vl(vl); /* skip intervening lengths */ set_bit(__vq_to_bit(vq), map); } @@ -1011,7 +1049,88 @@ void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p) SYS_SMCR_EL1); } -#endif /* CONFIG_ARM64_SVE */ +/* + * Read the pseudo-SMCR used by cpufeatures to identify the supported + * vector length. + * + * Use only if SME is present. + * This function clobbers the SME vector length. + */ +u64 read_smcr_features(void) +{ + u64 smcr; + unsigned int vq_max; + + sme_kernel_enable(NULL); + sme_smstart_sm(); + + /* + * Set the maximum possible VL. + */ + write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK, + SYS_SMCR_EL1); + + smcr = read_sysreg_s(SYS_SMCR_EL1); + smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */ + vq_max = sve_vq_from_vl(sve_get_vl()); + smcr |= vq_max - 1; /* set LEN field to maximum effective value */ + + sme_smstop_sm(); + + return smcr; +} + +void __init sme_setup(void) +{ + struct vl_info *info = &vl_info[ARM64_VEC_SME]; + u64 smcr; + int min_bit; + + if (!system_supports_sme()) + return; + + /* + * SME doesn't require any particular vector length be + * supported but it does require at least one. We should have + * disabled the feature entirely while bringing up CPUs but + * let's double check here. + */ + WARN_ON(bitmap_empty(info->vq_map, SVE_VQ_MAX)); + + min_bit = find_last_bit(info->vq_map, SVE_VQ_MAX); + info->min_vl = sve_vl_from_vq(__bit_to_vq(min_bit)); + + smcr = read_sanitised_ftr_reg(SYS_SMCR_EL1); + info->max_vl = sve_vl_from_vq((smcr & SMCR_ELx_LEN_MASK) + 1); + + /* + * Sanity-check that the max VL we determined through CPU features + * corresponds properly to sme_vq_map. If not, do our best: + */ + if (WARN_ON(info->max_vl != find_supported_vector_length(ARM64_VEC_SME, + info->max_vl))) + info->max_vl = find_supported_vector_length(ARM64_VEC_SME, + info->max_vl); + + WARN_ON(info->min_vl > info->max_vl); + + /* + * For the default VL, pick the maximum supported value <= 32 + * (256 bits) if there is one since this is guaranteed not to + * grow the signal frame when in streaming mode, otherwise the + * minimum available VL will be used. + */ + set_sme_default_vl(find_supported_vector_length(ARM64_VEC_SME, 32)); + + pr_info("SME: minimum available vector length %u bytes per vector\n", + info->min_vl); + pr_info("SME: maximum available vector length %u bytes per vector\n", + info->max_vl); + pr_info("SME: default vector length %u bytes per vector\n", + get_sme_default_vl()); +} + +#endif /* CONFIG_ARM64_SME */ /* * Trapped SVE access From patchwork Wed Jan 26 15:10:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537368 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D23CC3526D for ; Wed, 26 Jan 2022 15:12:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242597AbiAZPMU (ORCPT ); Wed, 26 Jan 2022 10:12:20 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:40852 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMU (ORCPT ); Wed, 26 Jan 2022 10:12:20 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AE5D7617B4 for ; Wed, 26 Jan 2022 15:12:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AF0E7C340E6; Wed, 26 Jan 2022 15:12:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209939; bh=IpbUE1LUP1+wV97Or602iA0h2YoCGBZmK9vzxYxrIfM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HRgSKsKzQ/GvYKHSUl+eMjIhEi5HhyZlfVP0dvFCmKMZYY5KWKuMxsqlPXu7o10nU FqXdLPIPT/3Q0BvynbAurpk8l+q2qAoPIqFb/CaS8V7NTFoX7oPgpFtKoUiidxj0C/ jtT6KtcvTNYLqvzTDZawATUH48zhASs2F3A0g/5wqHHC2n3GA8DHOd4AXF1+3KQCwn 4nrFJ1959tpE16g3mcI1eBx5kVymTZfiw8yQcwb0+D1bmCF7wlGUq4FtoMBRN6Srpw P8TbRfTXrJIPWxvZu2ZmIJi+EEGuuCcceUAPvbA6E6gewj8HUMafmZFmne5gwuYjMZ Hp8dsUHR8pYHg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 11/40] arm64/sme: Implement sysctl to set the default vector length Date: Wed, 26 Jan 2022 15:10:51 +0000 Message-Id: <20220126151120.3811248-12-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1673; h=from:subject; bh=IpbUE1LUP1+wV97Or602iA0h2YoCGBZmK9vzxYxrIfM=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR/FgJn/lc8qyFbpsUtXlg3lgi4N72n7+sKm+k1 dP+cSUGJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfwAKCRAk1otyXVSH0EW8B/ 9ttjf4/ojwSYQeBnvjKjyVqvnAoeKb9lBx/meHIflTtTt3sV3vd2uPBQpWKID+SolrOGttvSd1QzEA pvOgliHnu6mIPLLJ9QlXhJkW0bY+/oAT25fjJKyVMmmk2HBeFUwroryh2RKrJ7cKabSWVeC7omebSc kdXrD7xAVuViY6SBHU+lCEwrqp41vZAtpHD77bGL07Ozlr/ktZVVs1T0xw64AbxbWzFcdj0K+dp+ya n1wkoPh+uUvsSuRqKfoCifvOuIqjR/rDaDSsV+XK8uH0BRDuRlC0leJVkrwVaTIyBcezqxbaLv5blf LqRWrOAuS+tBwZk4VY3TrN7gL+bZ0r X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org As for SVE provide a sysctl which allows the default SME vector length to be configured. Signed-off-by: Mark Brown --- arch/arm64/kernel/fpsimd.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 3fb2167f8af7..841b9609342f 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -483,6 +483,30 @@ static int __init sve_sysctl_init(void) static int __init sve_sysctl_init(void) { return 0; } #endif /* ! (CONFIG_ARM64_SVE && CONFIG_SYSCTL) */ +#if defined(CONFIG_ARM64_SME) && defined(CONFIG_SYSCTL) +static struct ctl_table sme_default_vl_table[] = { + { + .procname = "sme_default_vector_length", + .mode = 0644, + .proc_handler = vec_proc_do_default_vl, + .extra1 = &vl_info[ARM64_VEC_SME], + }, + { } +}; + +static int __init sme_sysctl_init(void) +{ + if (system_supports_sme()) + if (!register_sysctl("abi", sme_default_vl_table)) + return -EINVAL; + + return 0; +} + +#else /* ! (CONFIG_ARM64_SME && CONFIG_SYSCTL) */ +static int __init sme_sysctl_init(void) { return 0; } +#endif /* ! (CONFIG_ARM64_SME && CONFIG_SYSCTL) */ + #define ZREG(sve_state, vq, n) ((char *)(sve_state) + \ (SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET)) @@ -1681,6 +1705,9 @@ static int __init fpsimd_init(void) if (cpu_have_named_feature(SME) && !cpu_have_named_feature(SVE)) pr_notice("SME is implemented but not SVE\n"); - return sve_sysctl_init(); + sve_sysctl_init(); + sme_sysctl_init(); + + return 0; } core_initcall(fpsimd_init); From patchwork Wed Jan 26 15:10:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537023 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D41BDC2BA4C for ; Wed, 26 Jan 2022 15:12:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235760AbiAZPMY (ORCPT ); Wed, 26 Jan 2022 10:12:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMY (ORCPT ); Wed, 26 Jan 2022 10:12:24 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02241C06161C for ; Wed, 26 Jan 2022 07:12:24 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 954B161463 for ; Wed, 26 Jan 2022 15:12:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 93969C340EB; Wed, 26 Jan 2022 15:12:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209943; bh=B/urqBimz+3K4ZCFBMO8EnKA3FRn7rrK3YpjvULX0fM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jWcNrcgmr7CmaLSM0S4z6l8fwh4Ro7ZK0enW2VCv9R2QeSK2RVB5CjHz0ys1hs/iT Cjh0TiBRsMHxVIBD9n88foEi2jRGnBa8idtvPvIBu/0RsZLIKFspQJmuVXVigLhn+t QhacGlZxWNDXUIFlH7X4FCMyxTRVpxG+ahbgalVSyVdQ3aFFpkV1y/9GMdTIO6PY2R bLs1ujQ83W8wCijnczVWKF6CEQOjSH6AFo0TZtMloaOdWt4fAvVGdkmnbyagx42bfd xlSv8FEpRm+yGsnxWbqdEt6tZvwbJeKEr/NKQTEyftIJmFkLzsa4p8CPKDb2bjFTpA BLO4jS4ZiI5xA== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 12/40] arm64/sme: Implement vector length configuration prctl()s Date: Wed, 26 Jan 2022 15:10:52 +0000 Message-Id: <20220126151120.3811248-13-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=5677; h=from:subject; bh=B/urqBimz+3K4ZCFBMO8EnKA3FRn7rrK3YpjvULX0fM=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WR/2XcqhBFxXZp1c4U9xXBU+g+VgJg0PVXp8ovU QJ6u6QyJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkfwAKCRAk1otyXVSH0AmaB/ 4xJWM56vCLc770WuODZ7GVzfD40R+N82ofMnh418pKhdAz1JW9nnvDYp4wzJ8saHzb+V4FKGm5HLXj mLjCgff75kHRECivjRhbdTXlGoyYRGu84sOrsvK6g0/FU8T4i4SJpW820BQFYU47mvXAnITH98qkmX REn9atM9hUQFk6wWuiEg6c1HUhxazm/pfyQVmt4k9vpWGZKbhgS1utrPqFK3PJRx7Lmr3TS8sYimRs sc0i0mkRFifk+jwNJ5lkFaM5+N+4Y6OH8iIWcoSSyOjujzyU8zUT+LEzHgD5zqn3DNWGULeU7mCF81 EeI9TmRFFEvW4LGbbGCcx0CeNolRKw X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org As for SVE provide a prctl() interface which allows processes to configure their SME vector length. Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 4 ++++ arch/arm64/include/asm/processor.h | 4 +++- arch/arm64/include/asm/thread_info.h | 1 + arch/arm64/kernel/fpsimd.c | 32 ++++++++++++++++++++++++++++ include/uapi/linux/prctl.h | 9 ++++++++ kernel/sys.c | 12 +++++++++++ 6 files changed, 61 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index 8cb99b6fe565..babf944e7c0c 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -288,6 +288,8 @@ static inline int sme_max_virtualisable_vl(void) } extern unsigned int sme_get_vl(void); +extern int sme_set_current_vl(unsigned long arg); +extern int sme_get_current_vl(void); #else @@ -299,6 +301,8 @@ static inline void sme_setup(void) { } static inline unsigned int sme_get_vl(void) { return 0; } static inline int sme_max_vl(void) { return 0; } static inline int sme_max_virtualisable_vl(void) { return 0; } +static inline int sme_set_current_vl(unsigned long arg) { return -EINVAL; } +static inline int sme_get_current_vl(void) { return -EINVAL; } #endif /* ! CONFIG_ARM64_SME */ diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index df68f0fb0ded..f2c2ebd440e2 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -354,9 +354,11 @@ extern void __init minsigstksz_setup(void); */ #include -/* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */ +/* Userspace interface for PR_S[MV]E_{SET,GET}_VL prctl()s: */ #define SVE_SET_VL(arg) sve_set_current_vl(arg) #define SVE_GET_VL() sve_get_current_vl() +#define SME_SET_VL(arg) sme_set_current_vl(arg) +#define SME_GET_VL() sme_get_current_vl() /* PR_PAC_RESET_KEYS prctl */ #define PAC_RESET_KEYS(tsk, arg) ptrauth_prctl_reset_keys(tsk, arg) diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h index e1317b7c4525..4e6b58dcd6f9 100644 --- a/arch/arm64/include/asm/thread_info.h +++ b/arch/arm64/include/asm/thread_info.h @@ -82,6 +82,7 @@ int arch_dup_task_struct(struct task_struct *dst, #define TIF_SVE_VL_INHERIT 24 /* Inherit SVE vl_onexec across exec */ #define TIF_SSBD 25 /* Wants SSB mitigation */ #define TIF_TAGGED_ADDR 26 /* Allow tagged user addresses */ +#define TIF_SME_VL_INHERIT 28 /* Inherit SME vl_onexec across exec */ #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 841b9609342f..1edb3996f9cf 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -149,6 +149,8 @@ static unsigned int vec_vl_inherit_flag(enum vec_type type) switch (type) { case ARM64_VEC_SVE: return TIF_SVE_VL_INHERIT; + case ARM64_VEC_SME: + return TIF_SME_VL_INHERIT; default: WARN_ON_ONCE(1); return 0; @@ -801,6 +803,36 @@ int sve_get_current_vl(void) return vec_prctl_status(ARM64_VEC_SVE, 0); } +#ifdef CONFIG_ARM64_SME +/* PR_SME_SET_VL */ +int sme_set_current_vl(unsigned long arg) +{ + unsigned long vl, flags; + int ret; + + vl = arg & PR_SME_VL_LEN_MASK; + flags = arg & ~vl; + + if (!system_supports_sme() || is_compat_task()) + return -EINVAL; + + ret = vec_set_vector_length(current, ARM64_VEC_SME, vl, flags); + if (ret) + return ret; + + return vec_prctl_status(ARM64_VEC_SME, flags); +} + +/* PR_SME_GET_VL */ +int sme_get_current_vl(void) +{ + if (!system_supports_sme() || is_compat_task()) + return -EINVAL; + + return vec_prctl_status(ARM64_VEC_SME, 0); +} +#endif /* CONFIG_ARM64_SME */ + static void vec_probe_vqs(struct vl_info *info, DECLARE_BITMAP(map, SVE_VQ_MAX)) { diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index e998764f0262..a5e06dcbba13 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -272,6 +272,15 @@ struct prctl_mm_map { # define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1 # define PR_SCHED_CORE_SCOPE_PROCESS_GROUP 2 +/* arm64 Scalable Matrix Extension controls */ +/* Flag values must be in sync with SVE versions */ +#define PR_SME_SET_VL 63 /* set task vector length */ +# define PR_SME_SET_VL_ONEXEC (1 << 18) /* defer effect until exec */ +#define PR_SME_GET_VL 64 /* get task vector length */ +/* Bits common to PR_SME_SET_VL and PR_SME_GET_VL */ +# define PR_SME_VL_LEN_MASK 0xffff +# define PR_SME_VL_INHERIT (1 << 17) /* inherit across exec */ + #define PR_SET_VMA 0x53564d41 # define PR_SET_VMA_ANON_NAME 0 diff --git a/kernel/sys.c b/kernel/sys.c index ecc4cf019242..5cd4fffc7cd0 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -116,6 +116,12 @@ #ifndef SVE_GET_VL # define SVE_GET_VL() (-EINVAL) #endif +#ifndef SME_SET_VL +# define SME_SET_VL(a) (-EINVAL) +#endif +#ifndef SME_GET_VL +# define SME_GET_VL() (-EINVAL) +#endif #ifndef PAC_RESET_KEYS # define PAC_RESET_KEYS(a, b) (-EINVAL) #endif @@ -2523,6 +2529,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, case PR_SVE_GET_VL: error = SVE_GET_VL(); break; + case PR_SME_SET_VL: + error = SME_SET_VL(arg2); + break; + case PR_SME_GET_VL: + error = SME_GET_VL(); + break; case PR_GET_SPECULATION_CTRL: if (arg3 || arg4 || arg5) return -EINVAL; From patchwork Wed Jan 26 15:10:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E182C28CF5 for ; Wed, 26 Jan 2022 15:12:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242593AbiAZPMa (ORCPT ); Wed, 26 Jan 2022 10:12:30 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41188 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPM3 (ORCPT ); Wed, 26 Jan 2022 10:12:29 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 110A7B81E3A for ; Wed, 26 Jan 2022 15:12:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7780EC36AE7; Wed, 26 Jan 2022 15:12:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209946; bh=c5P3WzFe/myKJt98GT0tqU73tZQT7e7FvegzOnxWfIU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AxGW1Uvcej3F2oq7336XWMsNq/65k7M8C+VdRxntKWFkdl1yVJCg1NNDWqV9plRFN En6Qct/NSQpwqLtrKh+ub3UPlRaxopJE66cWykSc48x8m2PEu4Wdz/IBYZ1Iuh9/to eqwNVaLozRDXDl17fgu8rieFy5WwVf6Le1FfwfN6qd0T8T9322GGOy2kRNmq/De0sI VU0t+lLK+9h8uo6TJ+TIfox+t3QWXK1Ue4YpxEQrpb5s9rIT0B4OVy4ijMA/fmVrfH oNXbn9cplJsAn3xuF0tY/U8eVWiu2105znOSGmDHko4u1rm3/Sb7dPgwuBUs/ew9ca iaFWHMhyyQUNw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 13/40] arm64/sme: Implement support for TPIDR2 Date: Wed, 26 Jan 2022 15:10:53 +0000 Message-Id: <20220126151120.3811248-14-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4258; h=from:subject; bh=c5P3WzFe/myKJt98GT0tqU73tZQT7e7FvegzOnxWfIU=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSAq7eZ7r91LceN+pOy5udpzbiaxzgFRcvUb03E Y2wAT9eJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkgAAKCRAk1otyXVSH0NoYCA CC7nz2X8fHrfM1HfxD5arqNgZapxWyyXgBK1pJHpIF2nUHi+7j/UnZQYa6tIaMeNQjgZSTkzSI+IaF RXD/MILwjYH2tHHos5KPeclwj1SIbgqr8paJecWnjhsrtIoEyAm3fBi9uG/U+Xn0M/Dpvq3WJraNE/ Bp9/63GldS9rE3AtBjm8rXT+TK1ixhrkWMnmVXtx1XRbHeSzHVv0tUpPqlL0xV9+jndmtOQmw2Dia0 N6hhlQSX6SafRhxaf6brxIBQMojZ182Z0HTkmzmnRFyczbXLi9s+kJG6UznDfMwBY9SaF4BfKLNYCI +xMX2t2wzG6CNjroOAD0NEUD+VvJJR X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The Scalable Matrix Extension introduces support for a new thread specific data register TPIDR2 intended for use by libc. The kernel must save the value of TPIDR2 on context switch and should ensure that all new threads start off with a default value of 0. Add a field to the thread_struct to store TPIDR2 and context switch it with the other thread specific data. In case there are future extensions which also use TPIDR2 we introduce system_supports_tpidr2() and use that rather than system_supports_sme() for TPIDR2 handling. Signed-off-by: Mark Brown --- arch/arm64/include/asm/cpufeature.h | 5 +++++ arch/arm64/include/asm/processor.h | 1 + arch/arm64/kernel/fpsimd.c | 4 ++++ arch/arm64/kernel/process.c | 14 ++++++++++++-- 4 files changed, 22 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 9d36035acce3..12252352d2ee 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -747,6 +747,11 @@ static __always_inline bool system_supports_fa64(void) cpus_have_const_cap(ARM64_SME_FA64); } +static __always_inline bool system_supports_tpidr2(void) +{ + return system_supports_sme(); +} + static __always_inline bool system_supports_cnp(void) { return IS_ENABLED(CONFIG_ARM64_CNP) && diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index f2c2ebd440e2..008a1767ebff 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -168,6 +168,7 @@ struct thread_struct { u64 mte_ctrl; #endif u64 sctlr_user; + u64 tpidr2_el0; }; static inline unsigned int thread_get_vl(struct thread_struct *thread, diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 1edb3996f9cf..40ef89120774 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1092,6 +1092,10 @@ void sme_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p) /* Allow SME in kernel */ write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_SMEN_EL1EN, CPACR_EL1); isb(); + + /* Allow EL0 to access TPIDR2 */ + write_sysreg(read_sysreg(SCTLR_EL1) | SCTLR_ELx_ENTP2, SCTLR_EL1); + isb(); } /* diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 5369e649fa79..e69a3dcdb0d9 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -250,6 +250,8 @@ void show_regs(struct pt_regs *regs) static void tls_thread_flush(void) { write_sysreg(0, tpidr_el0); + if (system_supports_tpidr2()) + write_sysreg_s(0, SYS_TPIDR2_EL0); if (is_compat_task()) { current->thread.uw.tp_value = 0; @@ -343,6 +345,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start, * out-of-sync with the saved value. */ *task_user_tls(p) = read_sysreg(tpidr_el0); + if (system_supports_tpidr2()) + p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0); if (stack_start) { if (is_compat_thread(task_thread_info(p))) @@ -353,10 +357,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start, /* * If a TLS pointer was passed to clone, use it for the new - * thread. + * thread. We also reset TPIDR2 if it's in use. */ - if (clone_flags & CLONE_SETTLS) + if (clone_flags & CLONE_SETTLS) { p->thread.uw.tp_value = tls; + p->thread.tpidr2_el0 = 0; + } } else { /* * A kthread has no context to ERET to, so ensure any buggy @@ -387,6 +393,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start, void tls_preserve_current_state(void) { *task_user_tls(current) = read_sysreg(tpidr_el0); + if (system_supports_tpidr2() && !is_compat_task()) + current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0); } static void tls_thread_switch(struct task_struct *next) @@ -399,6 +407,8 @@ static void tls_thread_switch(struct task_struct *next) write_sysreg(0, tpidrro_el0); write_sysreg(*task_user_tls(next), tpidr_el0); + if (system_supports_tpidr2()) + write_sysreg_s(next->thread.tpidr2_el0, SYS_TPIDR2_EL0); } /* From patchwork Wed Jan 26 15:10:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537022 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E53AC63684 for ; Wed, 26 Jan 2022 15:12:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242598AbiAZPMc (ORCPT ); Wed, 26 Jan 2022 10:12:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMb (ORCPT ); Wed, 26 Jan 2022 10:12:31 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3C6FC06161C for ; Wed, 26 Jan 2022 07:12:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5632C61842 for ; Wed, 26 Jan 2022 15:12:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 56875C36AE5; Wed, 26 Jan 2022 15:12:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209950; bh=3ak/TkOEva2A1EWWMZ6rEAo+xuM7bHsmgGI8j+vb0mA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ilmv31MYAj37zUV8Cap4i3AmyUQP9Xohmcf9+06VFwI9RkKBTdbHMKs4wNk1NSd4k sM6X/x5T//3ePbKx5IzVEX4m+am7fR5jqKGt1f8raUW60ZmiycgzP7HYiHrY8Oqmx9 VlyW9UbRN5OsyP5AYptFCd6su8SJ5QBxppRohqaC2YRTNPloJ40fwHV8pyFzpv8bfy zwM7l7Kdjs5lnM0pMHrqbACo9+my2nmMVRwo+pCuIMSyqWR8D/REO6IBmygG1YLrwG L3jzliIDgzw1OwjYIZ8l7GNIkv6vHiq250U5mS33pol1d7QltFkpmF/pTAMhsiUVSf UYf+9X0weTLdg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 14/40] arm64/sme: Implement SVCR context switching Date: Wed, 26 Jan 2022 15:10:54 +0000 Message-Id: <20220126151120.3811248-15-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6142; h=from:subject; bh=3ak/TkOEva2A1EWWMZ6rEAo+xuM7bHsmgGI8j+vb0mA=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSBxHICk/aRefIWC+RRWEwIqJy9Ttg8s+gBA7Ye 7XUSftuJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkgQAKCRAk1otyXVSH0J04B/ 45bSUx+fDZN41Pz+nVxj/3A8ENkn+ght7A+wYN6dQ+zmRXtuktYM5ZEg2MFUT2mHcB1qzx70yxJ0LA aLRmO18S5PWhERXcqapr8Y4DWoI1QSIV+0tFs9cOFxmgmxpqzAM8ES2RDISUdLDB+q+LCck6YXVG89 cR08Q1VhQKYQ3Mm2iHlCkoAGnQwFsYN6CcopUuTFjKD3mDZvDhTDYo/OcnfX80Ol6kWeniAYmmlZzd QHYcVyHY7l26N1y0IqbJtGoFFvyTeJ3M5rI9IkJxnlbbKoT1xsl6QjaKAMDM5AhmzjPW+Es328U/0t 9mQzPmBK12iuWK/KYiuDtF6fYlJ2K+ X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org In SME the use of both streaming SVE mode and ZA are tracked through PSTATE.SM and PSTATE.ZA, visible through the system register SVCR. In order to context switch the floating point state for SME we need to context switch the contents of this register as part of context switching the floating point state. Since changing the vector length exits streaming SVE mode and disables ZA we also make sure we update SVCR appropriately when setting vector length, and similarly ensure that new threads have streaming SVE mode and ZA disabled. Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 3 ++- arch/arm64/include/asm/processor.h | 1 + arch/arm64/include/asm/thread_info.h | 1 + arch/arm64/kernel/fpsimd.c | 18 +++++++++++++++++- arch/arm64/kernel/process.c | 2 ++ arch/arm64/kvm/fpsimd.c | 4 ++++ 6 files changed, 27 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index babf944e7c0c..d1bae65d3dba 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -46,7 +46,8 @@ extern void fpsimd_restore_current_state(void); extern void fpsimd_update_current_state(struct user_fpsimd_state const *state); extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state, - void *sve_state, unsigned int sve_vl); + void *sve_state, unsigned int sve_vl, + u64 *svcr); extern void fpsimd_flush_task_state(struct task_struct *target); extern void fpsimd_save_and_flush_cpu_state(void); diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index 008a1767ebff..7e08a4d48c24 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -168,6 +168,7 @@ struct thread_struct { u64 mte_ctrl; #endif u64 sctlr_user; + u64 svcr; u64 tpidr2_el0; }; diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h index 4e6b58dcd6f9..848739c15de8 100644 --- a/arch/arm64/include/asm/thread_info.h +++ b/arch/arm64/include/asm/thread_info.h @@ -82,6 +82,7 @@ int arch_dup_task_struct(struct task_struct *dst, #define TIF_SVE_VL_INHERIT 24 /* Inherit SVE vl_onexec across exec */ #define TIF_SSBD 25 /* Wants SSB mitigation */ #define TIF_TAGGED_ADDR 26 /* Allow tagged user addresses */ +#define TIF_SME 27 /* SME in use */ #define TIF_SME_VL_INHERIT 28 /* Inherit SME vl_onexec across exec */ #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 40ef89120774..a1918b71d335 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -121,6 +121,7 @@ struct fpsimd_last_state_struct { struct user_fpsimd_state *st; void *sve_state; + u64 *svcr; unsigned int sve_vl; }; @@ -359,6 +360,9 @@ static void task_fpsimd_load(void) WARN_ON(!system_supports_fpsimd()); WARN_ON(!have_cpu_fpsimd_context()); + if (IS_ENABLED(CONFIG_ARM64_SME) && test_thread_flag(TIF_SME)) + write_sysreg_s(current->thread.svcr, SYS_SVCR_EL0); + if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) { sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1); sve_load_state(sve_pffr(¤t->thread), @@ -384,6 +388,12 @@ static void fpsimd_save(void) if (test_thread_flag(TIF_FOREIGN_FPSTATE)) return; + if (IS_ENABLED(CONFIG_ARM64_SME) && + test_thread_flag(TIF_SME)) { + u64 *svcr = last->svcr; + *svcr = read_sysreg_s(SYS_SVCR_EL0); + } + if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) { if (WARN_ON(sve_get_vl() != last->sve_vl)) { @@ -735,6 +745,10 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type, if (test_and_clear_tsk_thread_flag(task, TIF_SVE)) sve_to_fpsimd(task); + if (system_supports_sme() && type == ARM64_VEC_SME) + task->thread.svcr &= ~(SYS_SVCR_EL0_SM_MASK | + SYS_SVCR_EL0_ZA_MASK); + if (task == current) put_cpu_fpsimd_context(); @@ -1398,6 +1412,7 @@ static void fpsimd_bind_task_to_cpu(void) last->st = ¤t->thread.uw.fpsimd_state; last->sve_state = current->thread.sve_state; last->sve_vl = task_get_sve_vl(current); + last->svcr = ¤t->thread.svcr; current->thread.fpsimd_cpu = smp_processor_id(); if (system_supports_sve()) { @@ -1412,7 +1427,7 @@ static void fpsimd_bind_task_to_cpu(void) } void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, - unsigned int sve_vl) + unsigned int sve_vl, u64 *svcr) { struct fpsimd_last_state_struct *last = this_cpu_ptr(&fpsimd_last_state); @@ -1421,6 +1436,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, WARN_ON(!in_softirq() && !irqs_disabled()); last->st = st; + last->svcr = svcr; last->sve_state = sve_state; last->sve_vl = sve_vl; } diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index e69a3dcdb0d9..f2d32a29641c 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -310,6 +310,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) dst->thread.sve_state = NULL; clear_tsk_thread_flag(dst, TIF_SVE); + dst->thread.svcr = 0; + /* clear any pending asynchronous tag fault raised by the parent */ clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT); diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index 2f48fd362a8c..04698c4bcd30 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -103,6 +103,10 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) WARN_ON_ONCE(!irqs_disabled()); if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) { + /* + * Currently we do not support SME guests so SVCR is + * always 0 and we just need a variable to point to. + */ fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs, vcpu->arch.sve_state, vcpu->arch.sve_max_vl); From patchwork Wed Jan 26 15:10:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537021 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 42FECC28CF5 for ; Wed, 26 Jan 2022 15:12:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242600AbiAZPMf (ORCPT ); Wed, 26 Jan 2022 10:12:35 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:41020 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235639AbiAZPMf (ORCPT ); Wed, 26 Jan 2022 10:12:35 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 34B0661463 for ; Wed, 26 Jan 2022 15:12:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 39A66C36AE3; Wed, 26 Jan 2022 15:12:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209954; bh=FbM5omNpKVlWYHj4CpEGaUUeTfnFSez5FdQgrP8X7BI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kKLygm2uoVTl5QjMGR4bmBhYj5mTaH0P7D3OKQ6L897ArykA2nnlBIKr5kAu7Gy/u jHu02MSDFlgr4UFmbViBGnaZQHxO853FMofrNG+MyZzePr2Um18XWybm4idIOLfLM6 F6xq5ZHjcc7XkDdNqbHEnJwx17STGYoVad2XpHK7a6tlh6zp0cMDK4aylor6AjKOC8 Tv9OPloSnQBc4iL/6Vlg0RmSQDt4t3OspWwd1s5jyYFAat5nUu9/J6fLbAM1NEsk9q LcUQAl8YdXj3Dmm/sB3dY7nl2CiWsq+igR7EUNAhe7kxhIgsqnMkNK84qWHnNMZmtS hMDlMiphoB5Cg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 15/40] arm64/sme: Implement streaming SVE context switching Date: Wed, 26 Jan 2022 15:10:55 +0000 Message-Id: <20220126151120.3811248-16-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=12873; h=from:subject; bh=FbM5omNpKVlWYHj4CpEGaUUeTfnFSez5FdQgrP8X7BI=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSCM4HwaIbkwMOJ8Gp+ywzMOTGQSE4vsjmP7EyU 9JyhDDuJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkggAKCRAk1otyXVSH0EUzB/ 9rOIrvRmu3IFylTx/sZ6oahOLGZdcTd2diWaJjjFSnoUadZwdQaVTCBdthHN6WXSSBMwJo4ErZFfhg xdWGAsWARfa0X5qnoyfA0z0qtQGXhXOFLF+VD6CDiZDcyi6w2eMKwZVvit2bxv1kvZquijr2TQiHLF m2EVLQ0wfPJ6nAfdtlySbfn3YzDOldn9JV9UJRJkwYS8u4pfpNjg6v5t6k3HIcZSOC5ssDr1CRvOts dmqBe9+NRHq3WN0QYmzeEnHmJZ/ppnKvv6XnqdCDxtPJHfN/iRe08rJ5BFzbCQ4qGMhniDl0+uTK0Q csjzAS7OrP2FfWzfn2cpjwiEHBiyYb X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org When in streaming mode we need to save and restore the streaming mode SVE register state rather than the regular SVE register state. This uses the streaming mode vector length and omits FFR but is otherwise identical, if TIF_SVE is enabled when we are in streaming mode then streaming mode takes precedence. This does not handle use of streaming SVE state with KVM, ptrace or signals. This will be updated in further patches. Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 22 +++++- arch/arm64/include/asm/fpsimdmacros.h | 11 +++ arch/arm64/include/asm/processor.h | 10 +++ arch/arm64/kernel/entry-fpsimd.S | 5 ++ arch/arm64/kernel/fpsimd.c | 109 +++++++++++++++++++++----- arch/arm64/kvm/fpsimd.c | 3 +- 6 files changed, 137 insertions(+), 23 deletions(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index d1bae65d3dba..650ec642159c 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -47,11 +47,21 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state); extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state, void *sve_state, unsigned int sve_vl, - u64 *svcr); + unsigned int sme_vl, u64 *svcr); extern void fpsimd_flush_task_state(struct task_struct *target); extern void fpsimd_save_and_flush_cpu_state(void); +static inline bool thread_sm_enabled(struct thread_struct *thread) +{ + return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_SM_MASK); +} + +static inline bool thread_za_enabled(struct thread_struct *thread) +{ + return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_ZA_MASK); +} + /* Maximum VL that SVE/SME VL-agnostic software can transparently support */ #define VL_ARCH_MAX 0x100 @@ -63,7 +73,14 @@ static inline size_t sve_ffr_offset(int vl) static inline void *sve_pffr(struct thread_struct *thread) { - return (char *)thread->sve_state + sve_ffr_offset(thread_get_sve_vl(thread)); + unsigned int vl; + + if (system_supports_sme() && thread_sm_enabled(thread)) + vl = thread_get_sme_vl(thread); + else + vl = thread_get_sve_vl(thread); + + return (char *)thread->sve_state + sve_ffr_offset(vl); } extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr); @@ -72,6 +89,7 @@ extern void sve_load_state(void const *state, u32 const *pfpsr, extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1); extern unsigned int sve_get_vl(void); extern void sve_set_vq(unsigned long vq_minus_1); +extern void sme_set_vq(unsigned long vq_minus_1); struct arm64_cpu_capabilities; extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused); diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h index 11c426ddd62c..a50fa1eab730 100644 --- a/arch/arm64/include/asm/fpsimdmacros.h +++ b/arch/arm64/include/asm/fpsimdmacros.h @@ -261,6 +261,17 @@ 921: .endm +/* Update SMCR_EL1.LEN with the new VQ */ +.macro sme_load_vq xvqminus1, xtmp, xtmp2 + mrs_s \xtmp, SYS_SMCR_EL1 + bic \xtmp2, \xtmp, SMCR_ELx_LEN_MASK + orr \xtmp2, \xtmp2, \xvqminus1 + cmp \xtmp2, \xtmp + b.eq 921f + msr_s SYS_SMCR_EL1, \xtmp2 //self-synchronising +921: +.endm + /* Preserve the first 128-bits of Znz and zero the rest. */ .macro _sve_flush_z nz _sve_check_zreg \nz diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index 7e08a4d48c24..6e2af9de153c 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -183,6 +183,11 @@ static inline unsigned int thread_get_sve_vl(struct thread_struct *thread) return thread_get_vl(thread, ARM64_VEC_SVE); } +static inline unsigned int thread_get_sme_vl(struct thread_struct *thread) +{ + return thread_get_vl(thread, ARM64_VEC_SME); +} + unsigned int task_get_vl(const struct task_struct *task, enum vec_type type); void task_set_vl(struct task_struct *task, enum vec_type type, unsigned long vl); @@ -196,6 +201,11 @@ static inline unsigned int task_get_sve_vl(const struct task_struct *task) return task_get_vl(task, ARM64_VEC_SVE); } +static inline unsigned int task_get_sme_vl(const struct task_struct *task) +{ + return task_get_vl(task, ARM64_VEC_SME); +} + static inline void task_set_sve_vl(struct task_struct *task, unsigned long vl) { task_set_vl(task, ARM64_VEC_SVE, vl); diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S index deee5f01462e..6f88c0f86d50 100644 --- a/arch/arm64/kernel/entry-fpsimd.S +++ b/arch/arm64/kernel/entry-fpsimd.S @@ -94,4 +94,9 @@ SYM_FUNC_START(sme_get_vl) ret SYM_FUNC_END(sme_get_vl) +SYM_FUNC_START(sme_set_vq) + sme_load_vq x0, x1, x2 + ret +SYM_FUNC_END(sme_set_vq) + #endif /* CONFIG_ARM64_SME */ diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index a1918b71d335..12fef62cf07a 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -123,6 +123,7 @@ struct fpsimd_last_state_struct { void *sve_state; u64 *svcr; unsigned int sve_vl; + unsigned int sme_vl; }; static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state); @@ -301,17 +302,28 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type, task->thread.vl_onexec[type] = vl; } +/* + * TIF_SME controls whether a task can use SME without trapping while + * in userspace, when TIF_SME is set then we must have storage + * alocated in sve_state and za_state to store the contents of both ZA + * and the SVE registers for both streaming and non-streaming modes. + * + * If both SVCR.ZA and SVCR.SM are disabled then at any point we + * may disable TIF_SME and reenable traps. + */ + + /* * TIF_SVE controls whether a task can use SVE without trapping while - * in userspace, and also the way a task's FPSIMD/SVE state is stored - * in thread_struct. + * in userspace, and also (together with TIF_SME) the way a task's + * FPSIMD/SVE state is stored in thread_struct. * * The kernel uses this flag to track whether a user task is actively * using SVE, and therefore whether full SVE register state needs to * be tracked. If not, the cheaper FPSIMD context handling code can * be used instead of the more costly SVE equivalents. * - * * TIF_SVE set: + * * TIF_SVE or SVCR.SM set: * * The task can execute SVE instructions while in userspace without * trapping to the kernel. @@ -319,7 +331,8 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type, * When stored, Z0-Z31 (incorporating Vn in bits[127:0] or the * corresponding Zn), P0-P15 and FFR are encoded in in * task->thread.sve_state, formatted appropriately for vector - * length task->thread.sve_vl. + * length task->thread.sve_vl or, if SVCR.SM is set, + * task->thread.sme_vl. * * task->thread.sve_state must point to a valid buffer at least * sve_state_size(task) bytes in size. @@ -357,19 +370,40 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type, */ static void task_fpsimd_load(void) { + bool restore_sve_regs = false; + bool restore_ffr; + WARN_ON(!system_supports_fpsimd()); WARN_ON(!have_cpu_fpsimd_context()); - if (IS_ENABLED(CONFIG_ARM64_SME) && test_thread_flag(TIF_SME)) - write_sysreg_s(current->thread.svcr, SYS_SVCR_EL0); - + /* Check if we should restore SVE first */ if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) { sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1); + restore_sve_regs = true; + restore_ffr = true; + } + + /* Restore SME, override SVE register configuration if needed */ + if (system_supports_sme()) { + unsigned long sme_vl = task_get_sme_vl(current); + + if (test_thread_flag(TIF_SME)) + sme_set_vq(sve_vq_from_vl(sme_vl) - 1); + + write_sysreg_s(current->thread.svcr, SYS_SVCR_EL0); + + if (thread_sm_enabled(¤t->thread)) { + restore_sve_regs = true; + restore_ffr = system_supports_fa64(); + } + } + + if (restore_sve_regs) sve_load_state(sve_pffr(¤t->thread), - ¤t->thread.uw.fpsimd_state.fpsr, true); - } else { + ¤t->thread.uw.fpsimd_state.fpsr, + restore_ffr); + else fpsimd_load_state(¤t->thread.uw.fpsimd_state); - } } /* @@ -381,6 +415,9 @@ static void fpsimd_save(void) struct fpsimd_last_state_struct const *last = this_cpu_ptr(&fpsimd_last_state); /* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */ + bool save_sve_regs = false; + bool save_ffr; + unsigned int vl; WARN_ON(!system_supports_fpsimd()); WARN_ON(!have_cpu_fpsimd_context()); @@ -388,15 +425,33 @@ static void fpsimd_save(void) if (test_thread_flag(TIF_FOREIGN_FPSTATE)) return; - if (IS_ENABLED(CONFIG_ARM64_SME) && - test_thread_flag(TIF_SME)) { + if (test_thread_flag(TIF_SVE)) { + save_sve_regs = true; + save_ffr = true; + vl = last->sve_vl; + } + + if (system_supports_sme()) { u64 *svcr = last->svcr; *svcr = read_sysreg_s(SYS_SVCR_EL0); + + if (thread_za_enabled(¤t->thread)) { + /* ZA state managment is not implemented yet */ + force_signal_inject(SIGKILL, SI_KERNEL, 0, 0); + return; + } + + /* If we are in streaming mode override regular SVE. */ + if (*svcr & SYS_SVCR_EL0_SM_MASK) { + save_sve_regs = true; + save_ffr = system_supports_fa64(); + vl = last->sme_vl; + } } - if (IS_ENABLED(CONFIG_ARM64_SVE) && - test_thread_flag(TIF_SVE)) { - if (WARN_ON(sve_get_vl() != last->sve_vl)) { + if (IS_ENABLED(CONFIG_ARM64_SVE) && save_sve_regs) { + /* Get the configured VL from RDVL, will account for SM */ + if (WARN_ON(sve_get_vl() != vl)) { /* * Can't save the user regs, so current would * re-enter user with corrupt state. @@ -407,8 +462,8 @@ static void fpsimd_save(void) } sve_save_state((char *)last->sve_state + - sve_ffr_offset(last->sve_vl), - &last->st->fpsr, true); + sve_ffr_offset(vl), + &last->st->fpsr, save_ffr); } else { fpsimd_save_state(last->st); } @@ -613,7 +668,14 @@ static void sve_to_fpsimd(struct task_struct *task) */ static size_t sve_state_size(struct task_struct const *task) { - return SVE_SIG_REGS_SIZE(sve_vq_from_vl(task_get_sve_vl(task))); + unsigned int vl = 0; + + if (system_supports_sve()) + vl = task_get_sve_vl(task); + if (system_supports_sme()) + vl = max(vl, task_get_sme_vl(task)); + + return SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)); } /* @@ -742,7 +804,8 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type, } fpsimd_flush_task_state(task); - if (test_and_clear_tsk_thread_flag(task, TIF_SVE)) + if (test_and_clear_tsk_thread_flag(task, TIF_SVE) || + thread_sm_enabled(&task->thread)) sve_to_fpsimd(task); if (system_supports_sme() && type == ARM64_VEC_SME) @@ -1369,6 +1432,9 @@ void fpsimd_flush_thread(void) fpsimd_flush_thread_vl(ARM64_VEC_SVE); } + if (system_supports_sme()) + fpsimd_flush_thread_vl(ARM64_VEC_SME); + put_cpu_fpsimd_context(); } @@ -1412,6 +1478,7 @@ static void fpsimd_bind_task_to_cpu(void) last->st = ¤t->thread.uw.fpsimd_state; last->sve_state = current->thread.sve_state; last->sve_vl = task_get_sve_vl(current); + last->sme_vl = task_get_sme_vl(current); last->svcr = ¤t->thread.svcr; current->thread.fpsimd_cpu = smp_processor_id(); @@ -1427,7 +1494,8 @@ static void fpsimd_bind_task_to_cpu(void) } void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, - unsigned int sve_vl, u64 *svcr) + unsigned int sve_vl, unsigned int sme_vl, + u64 *svcr) { struct fpsimd_last_state_struct *last = this_cpu_ptr(&fpsimd_last_state); @@ -1439,6 +1507,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, last->svcr = svcr; last->sve_state = sve_state; last->sve_vl = sve_vl; + last->sme_vl = sme_vl; } /* diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index 04698c4bcd30..902c598b7ed2 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -109,7 +109,8 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) */ fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs, vcpu->arch.sve_state, - vcpu->arch.sve_max_vl); + vcpu->arch.sve_max_vl, + 0); clear_thread_flag(TIF_FOREIGN_FPSTATE); update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu)); From patchwork Wed Jan 26 15:10:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537014 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B42EC3526D for ; Wed, 26 Jan 2022 15:13:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242571AbiAZPNj (ORCPT ); Wed, 26 Jan 2022 10:13:39 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41350 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242607AbiAZPMk (ORCPT ); Wed, 26 Jan 2022 10:12:40 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AE6C4B81CA7 for ; Wed, 26 Jan 2022 15:12:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1E1E7C340E6; Wed, 26 Jan 2022 15:12:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209958; bh=ozE2gweVnYm8+WGvKFBpmz6Wvw67yql2wccmrToa5wo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ENSOmwE8JZEln3pD0wJb88xDkLfb/mfJCUoRxY2HoVObjxlwoe3khvgE7R+WtXE0Q Ffwi4N1qHcvrX6SpXCJe3vtClFLfEPerVzBaxJoLCZuYsy2jm59XgiK6oHfWC0L9nl xw9pCKMaNu/oWhsoxqR78KSUe93Z4H3cnnZGyzq2jnFrzVu2fiYjYRU2NRvd2kuOI9 yqmN6X6XwM31+z8KxY+ptUVcvkyTWvMdq1do9cwvQeztaxlvmA/TNewIyB3UV/T6YT acJFnRqMcPoUv+krYdCdsrkW+jhu6cJO0FRL4nE6Amg0KsSpm0k0kVLdoM/zro+4bh uYykUQqtw1DZw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 16/40] arm64/sme: Implement ZA context switching Date: Wed, 26 Jan 2022 15:10:56 +0000 Message-Id: <20220126151120.3811248-17-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=7644; h=from:subject; bh=ozE2gweVnYm8+WGvKFBpmz6Wvw67yql2wccmrToa5wo=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSDiazh0f/KJVb2hmtlqZufPDDGhlyCHbuaNZFz dX5MOB6JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkgwAKCRAk1otyXVSH0PgJB/ 9mm/03EEs1ZXrhWhqdTFvKLmy4viRkdEMQTbLfs260O6IGRncTUL8vL4QepqhTV7dik8MJ1J9hiDEt rO3itClZzOqMru52xd/hyY44v9b8J8vCPn8A2BRZYtAt4TNa8CEZaaaLLRXnk1AmH5C9Hq5ajOlmqQ ql8rzTYgIQXzCQ4LqXJEwSg+yKao1+e9XNrYkgddhM8KWRUEN+13tHOvrM/eczx7xHhgkSyD7jhj7A dPkgy3ge/gbeGQAFiICrFNRalmEuEIzsfV3TzFtq1hFbnQa+Sek3SeI+C1Qb6acvdXRdCbP3W9sZ74 zRhRVS2J9gORpLfgxnTgXGe5u4tNA/ X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Allocate space for storing ZA on first access to SME and use that to save and restore ZA state when context switching. We do this by using the vector form of the LDR and STR ZA instructions, these do not require streaming mode and have implementation recommendations that they avoid contention issues in shared SMCU implementations. Since ZA is architecturally guaranteed to be zeroed when enabled we do not need to explicitly zero ZA, either we will be restoring from a saved copy or trapping on first use of SME so we know that ZA must be disabled. Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 5 ++++- arch/arm64/include/asm/fpsimdmacros.h | 22 ++++++++++++++++++++++ arch/arm64/include/asm/kvm_host.h | 3 +++ arch/arm64/include/asm/processor.h | 1 + arch/arm64/kernel/entry-fpsimd.S | 22 ++++++++++++++++++++++ arch/arm64/kernel/fpsimd.c | 20 +++++++++++++------- arch/arm64/kvm/fpsimd.c | 2 +- 7 files changed, 66 insertions(+), 9 deletions(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index 650ec642159c..af404e5b8d82 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -47,7 +47,8 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state); extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state, void *sve_state, unsigned int sve_vl, - unsigned int sme_vl, u64 *svcr); + void *za_state, unsigned int sme_vl, + u64 *svcr); extern void fpsimd_flush_task_state(struct task_struct *target); extern void fpsimd_save_and_flush_cpu_state(void); @@ -90,6 +91,8 @@ extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1); extern unsigned int sve_get_vl(void); extern void sve_set_vq(unsigned long vq_minus_1); extern void sme_set_vq(unsigned long vq_minus_1); +extern void za_save_state(void *state); +extern void za_load_state(void const *state); struct arm64_cpu_capabilities; extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused); diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h index a50fa1eab730..35d67e393519 100644 --- a/arch/arm64/include/asm/fpsimdmacros.h +++ b/arch/arm64/include/asm/fpsimdmacros.h @@ -318,3 +318,25 @@ ldr w\nxtmp, [\xpfpsr, #4] msr fpcr, x\nxtmp .endm + +.macro sme_save_za nxbase, xvl, nw + mov w\nw, #0 + +423: + _sme_str_zav \nw, \nxbase + add x\nxbase, x\nxbase, \xvl + add x\nw, x\nw, #1 + cmp \xvl, x\nw + bne 423b +.endm + +.macro sme_load_za nxbase, xvl, nw + mov w\nw, #0 + +423: + _sme_ldr_zav \nw, \nxbase + add x\nxbase, x\nxbase, \xvl + add x\nw, x\nw, #1 + cmp \xvl, x\nw + bne 423b +.endm diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5bc01e62c08a..7dc85d5a6552 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -280,8 +280,11 @@ struct vcpu_reset_state { struct kvm_vcpu_arch { struct kvm_cpu_context ctxt; + + /* Guest floating point state */ void *sve_state; unsigned int sve_max_vl; + u64 svcr; /* Stage 2 paging state used by the hardware on next switch */ struct kvm_s2_mmu *hw_mmu; diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index 6e2af9de153c..5a5c5edd76df 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -153,6 +153,7 @@ struct thread_struct { unsigned int fpsimd_cpu; void *sve_state; /* SVE registers, if any */ + void *za_state; /* ZA register, if any */ unsigned int vl[ARM64_VEC_MAX]; /* vector length */ unsigned int vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */ unsigned long fault_address; /* fault info */ diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S index 6f88c0f86d50..229436f33df5 100644 --- a/arch/arm64/kernel/entry-fpsimd.S +++ b/arch/arm64/kernel/entry-fpsimd.S @@ -99,4 +99,26 @@ SYM_FUNC_START(sme_set_vq) ret SYM_FUNC_END(sme_set_vq) +/* + * Save the SME state + * + * x0 - pointer to buffer for state + */ +SYM_FUNC_START(za_save_state) + _sme_rdsvl 1, 1 // x1 = VL/8 + sme_save_za 0, x1, 12 + ret +SYM_FUNC_END(za_save_state) + +/* + * Load the SME state + * + * x0 - pointer to buffer for state + */ +SYM_FUNC_START(za_load_state) + _sme_rdsvl 1, 1 // x1 = VL/8 + sme_load_za 0, x1, 12 + ret +SYM_FUNC_END(za_load_state) + #endif /* CONFIG_ARM64_SME */ diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 12fef62cf07a..c9e8186e69c0 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -121,6 +121,7 @@ struct fpsimd_last_state_struct { struct user_fpsimd_state *st; void *sve_state; + void *za_state; u64 *svcr; unsigned int sve_vl; unsigned int sme_vl; @@ -387,11 +388,15 @@ static void task_fpsimd_load(void) if (system_supports_sme()) { unsigned long sme_vl = task_get_sme_vl(current); + /* Ensure VL is set up for restoring data */ if (test_thread_flag(TIF_SME)) sme_set_vq(sve_vq_from_vl(sme_vl) - 1); write_sysreg_s(current->thread.svcr, SYS_SVCR_EL0); + if (thread_za_enabled(¤t->thread)) + za_load_state(current->thread.za_state); + if (thread_sm_enabled(¤t->thread)) { restore_sve_regs = true; restore_ffr = system_supports_fa64(); @@ -435,11 +440,10 @@ static void fpsimd_save(void) u64 *svcr = last->svcr; *svcr = read_sysreg_s(SYS_SVCR_EL0); - if (thread_za_enabled(¤t->thread)) { - /* ZA state managment is not implemented yet */ - force_signal_inject(SIGKILL, SI_KERNEL, 0, 0); - return; - } + *svcr = read_sysreg_s(SYS_SVCR_EL0); + + if (*svcr & SYS_SVCR_EL0_ZA_MASK) + za_save_state(last->za_state); /* If we are in streaming mode override regular SVE. */ if (*svcr & SYS_SVCR_EL0_SM_MASK) { @@ -1477,6 +1481,7 @@ static void fpsimd_bind_task_to_cpu(void) WARN_ON(!system_supports_fpsimd()); last->st = ¤t->thread.uw.fpsimd_state; last->sve_state = current->thread.sve_state; + last->za_state = current->thread.za_state; last->sve_vl = task_get_sve_vl(current); last->sme_vl = task_get_sme_vl(current); last->svcr = ¤t->thread.svcr; @@ -1494,8 +1499,8 @@ static void fpsimd_bind_task_to_cpu(void) } void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, - unsigned int sve_vl, unsigned int sme_vl, - u64 *svcr) + unsigned int sve_vl, void *za_state, + unsigned int sme_vl, u64 *svcr) { struct fpsimd_last_state_struct *last = this_cpu_ptr(&fpsimd_last_state); @@ -1506,6 +1511,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, last->st = st; last->svcr = svcr; last->sve_state = sve_state; + last->za_state = za_state; last->sve_vl = sve_vl; last->sme_vl = sme_vl; } diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index 902c598b7ed2..338733ac63f8 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -110,7 +110,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs, vcpu->arch.sve_state, vcpu->arch.sve_max_vl, - 0); + NULL, 0, &vcpu->arch.svcr); clear_thread_flag(TIF_FOREIGN_FPSTATE); update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu)); From patchwork Wed Jan 26 15:10:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537366 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9D24C3526D for ; Wed, 26 Jan 2022 15:12:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242647AbiAZPMq (ORCPT ); Wed, 26 Jan 2022 10:12:46 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41374 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242634AbiAZPMp (ORCPT ); Wed, 26 Jan 2022 10:12:45 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AF263B81E3A for ; Wed, 26 Jan 2022 15:12:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 05F47C340EB; Wed, 26 Jan 2022 15:12:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209962; bh=dxZojSg+AyGMNDiyUywnEA9WS18lR7uN1j1lZgSyfRs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Q5daYwqSsSa0kktiYLWV1ZgVs35Drfnt+4dAvCD2Q95wNnGrhkana4nwdBCRh0I/d wqRUNZizWGNopH+Oy6ccrXRJU8U+/w9EHKUVeCLac+MC5OpWIcbv7Pyor1rErcSSCb 1XoJfjqANvcwWVakYjx/gMykuJolKaoCL8iL9E8Uix0vPPJKcZ8t3740FSPeoPqiIi 1YC8DY2oZM5Rv1EXyookfTW5Y6I2SZWAjjb7XzKWjG/c8N3/Iqx4m/UkF57A2dk7K6 bSocidQSiFygSfYA3oHPa860ZZOvGzUYIGXqNL/WUXnh7J5NgLEm8+LzCZ8qyVICDU ZrUx20UF4SlAQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 17/40] arm64/sme: Implement traps and syscall handling for SME Date: Wed, 26 Jan 2022 15:10:57 +0000 Message-Id: <20220126151120.3811248-18-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=16748; h=from:subject; bh=dxZojSg+AyGMNDiyUywnEA9WS18lR7uN1j1lZgSyfRs=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSEruvOzzUQqJA2oxrU5e+/NfW9xtUspy1kBoF0 M0t/lZmJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkhAAKCRAk1otyXVSH0Gz5B/ 9WSj0qiEzTHRD3NzMRZne3+hc7N3IVQ0oR6r6Hmt2NePKBaSwxhUj8Bx294736siXV+EjfCLWhjuPl 4YiiCrQSdbmX+JddNv8C3CRSSKVPP3ynhGiikhbuFgVZrUYi6IQ6iQzN+yh6chPhgeGdCA7XOEzviE MeGn3lGMA+r+qJnQG+AbA5G1prayl0yqDqrfGgIet7ORzrW43P/dQV25ufAAdw4uVabKOTy96/SB6b /yBHYwaf1+gXbkqKVKqaCeY+tYFh/uE3kssZA/dDclUFdkZtb3SfWnlOEyoFrQGuZbupyn8YJzjsVK KtzkhQpAac71ICF+lqurO36OccuhPI X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org By default all SME operations in userspace will trap. When this happens we allocate storage space for the SME register state, set up the SVE registers and disable traps. We do not need to initialize ZA since the architecture guarantees that it will be zeroed when enabled and when we trap ZA is disabled. On syscall we exit streaming mode if we were previously in it and ensure that all but the lower 128 bits of the registers are zeroed while preserving the state of ZA. This follows the aarch64 PCS for SME, ZA state is preserved over a function call and streaming mode is exited. Since the traps for SME do not distinguish between streaming mode SVE and ZA usage if ZA is in use rather than reenabling traps we instead zero the parts of the SVE registers not shared with FPSIMD and leave SME enabled, this simplifies handling SME traps. If ZA is not in use then we reenable SME traps and fall through to normal handling of SVE. Signed-off-by: Mark Brown --- arch/arm64/include/asm/esr.h | 1 + arch/arm64/include/asm/exception.h | 1 + arch/arm64/include/asm/fpsimd.h | 27 +++++ arch/arm64/kernel/entry-common.c | 11 ++ arch/arm64/kernel/fpsimd.c | 180 ++++++++++++++++++++++++++--- arch/arm64/kernel/process.c | 12 +- arch/arm64/kernel/syscall.c | 34 +++++- 7 files changed, 242 insertions(+), 24 deletions(-) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index 43872e0cfd1e..0467837fd66b 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -76,6 +76,7 @@ #define ESR_ELx_IL_SHIFT (25) #define ESR_ELx_IL (UL(1) << ESR_ELx_IL_SHIFT) #define ESR_ELx_ISS_MASK (ESR_ELx_IL - 1) +#define ESR_ELx_ISS(esr) ((esr) & ESR_ELx_ISS_MASK) /* ISS field definitions shared by different classes */ #define ESR_ELx_WNR_SHIFT (6) diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h index 339477dca551..2add7f33b7c2 100644 --- a/arch/arm64/include/asm/exception.h +++ b/arch/arm64/include/asm/exception.h @@ -64,6 +64,7 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr, struct pt_regs *regs); void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs); void do_sve_acc(unsigned int esr, struct pt_regs *regs); +void do_sme_acc(unsigned int esr, struct pt_regs *regs); void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs); void do_sysinstr(unsigned int esr, struct pt_regs *regs); void do_sp_pc_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs); diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index af404e5b8d82..3ca152aaf7c9 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -282,6 +282,16 @@ static inline void sve_setup(void) { } #ifdef CONFIG_ARM64_SME +static inline void sme_user_disable(void) +{ + sysreg_clear_set(cpacr_el1, CPACR_EL1_SMEN_EL0EN, 0); +} + +static inline void sme_user_enable(void) +{ + sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_SMEN_EL0EN); +} + static inline void sme_smstart_sm(void) { asm volatile(".inst 0xd503437f"); @@ -309,16 +319,33 @@ static inline int sme_max_virtualisable_vl(void) return vec_max_virtualisable_vl(ARM64_VEC_SME); } +extern void sme_alloc(struct task_struct *task); extern unsigned int sme_get_vl(void); extern int sme_set_current_vl(unsigned long arg); extern int sme_get_current_vl(void); +/* + * Return how many bytes of memory are required to store the full SME + * specific state (currently just ZA) for task, given task's currently + * configured vector length. + */ +static inline size_t za_state_size(struct task_struct const *task) +{ + unsigned int vl = task_get_sme_vl(task); + + return ZA_SIG_REGS_SIZE(sve_vq_from_vl(vl)); +} + #else +static inline void sme_user_disable(void) { BUILD_BUG(); } +static inline void sme_user_enable(void) { BUILD_BUG(); } + static inline void sme_smstart_sm(void) { } static inline void sme_smstop_sm(void) { } static inline void sme_smstop(void) { } +static inline void sme_alloc(struct task_struct *task) { } static inline void sme_setup(void) { } static inline unsigned int sme_get_vl(void) { return 0; } static inline int sme_max_vl(void) { return 0; } diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index ef7fcefb96bd..b2ad525d6c87 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -524,6 +524,14 @@ static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr) exit_to_user_mode(regs); } +static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr) +{ + enter_from_user_mode(regs); + local_daif_restore(DAIF_PROCCTX); + do_sme_acc(esr, regs); + exit_to_user_mode(regs); +} + static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr) { enter_from_user_mode(regs); @@ -632,6 +640,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs) case ESR_ELx_EC_SVE: el0_sve_acc(regs, esr); break; + case ESR_ELx_EC_SME: + el0_sme_acc(regs, esr); + break; case ESR_ELx_EC_FP_EXC64: el0_fpsimd_exc(regs, esr); break; diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index c9e8186e69c0..33f80512753d 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -209,6 +209,12 @@ static void set_sme_default_vl(int val) set_default_vl(ARM64_VEC_SME, val); } +static void sme_free(struct task_struct *); + +#else + +static inline void sme_free(struct task_struct *t) { } + #endif DEFINE_PER_CPU(bool, fpsimd_context_busy); @@ -409,6 +415,21 @@ static void task_fpsimd_load(void) restore_ffr); else fpsimd_load_state(¤t->thread.uw.fpsimd_state); + + /* + * If we didn't set up any SVE registers but we do have SME + * enabled for userspace then ensure the SVE registers are + * flushed since userspace can switch to streaming mode and + * view the register state without trapping. + */ + if (system_supports_sme() && test_thread_flag(TIF_SME) && + !restore_sve_regs) { + int sve_vq_minus_one; + + sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1; + sve_set_vq(sve_vq_minus_one); + sve_flush_live(true, sve_vq_minus_one); + } } /* @@ -812,18 +833,22 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type, thread_sm_enabled(&task->thread)) sve_to_fpsimd(task); - if (system_supports_sme() && type == ARM64_VEC_SME) + if (system_supports_sme() && type == ARM64_VEC_SME) { task->thread.svcr &= ~(SYS_SVCR_EL0_SM_MASK | SYS_SVCR_EL0_ZA_MASK); + clear_thread_flag(TIF_SME); + } if (task == current) put_cpu_fpsimd_context(); /* - * Force reallocation of task SVE state to the correct size - * on next use: + * Force reallocation of task SVE and SME state to the correct + * size on next use: */ sve_free(task); + if (system_supports_sme() && type == ARM64_VEC_SME) + sme_free(task); task_set_vl(task, type, vl); @@ -1158,12 +1183,43 @@ void __init sve_setup(void) void fpsimd_release_task(struct task_struct *dead_task) { __sve_free(dead_task); + sme_free(dead_task); } #endif /* CONFIG_ARM64_SVE */ #ifdef CONFIG_ARM64_SME +/* This will move to uapi/asm/sigcontext.h when signals are implemented */ +#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES)) + +/* + * Ensure that task->thread.za_state is allocated and sufficiently large. + * + * This function should be used only in preparation for replacing + * task->thread.za_state with new data. The memory is always zeroed + * here to prevent stale data from showing through: this is done in + * the interest of testability and predictability, the architecture + * guarantees that when ZA is enabled it will be zeroed. + */ +void sme_alloc(struct task_struct *task) +{ + if (task->thread.za_state) { + memset(task->thread.za_state, 0, za_state_size(task)); + return; + } + + /* This could potentially be up to 64K. */ + task->thread.za_state = + kzalloc(za_state_size(task), GFP_KERNEL); +} + +static void sme_free(struct task_struct *task) +{ + kfree(task->thread.za_state); + task->thread.za_state = NULL; +} + void sme_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p) { /* Set priority for all PEs to architecturally defined minimum */ @@ -1273,6 +1329,29 @@ void __init sme_setup(void) #endif /* CONFIG_ARM64_SME */ +static void sve_init_regs(void) +{ + /* + * Convert the FPSIMD state to SVE, zeroing all the state that + * is not shared with FPSIMD. If (as is likely) the current + * state is live in the registers then do this there and + * update our metadata for the current task including + * disabling the trap, otherwise update our in-memory copy. + * We are guaranteed to not be in streaming mode, we can only + * take a SVE trap when not in streaming mode and we can't be + * in streaming mode when taking a SME trap. + */ + if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { + unsigned long vq_minus_one = + sve_vq_from_vl(task_get_sve_vl(current)) - 1; + sve_set_vq(vq_minus_one); + sve_flush_live(true, vq_minus_one); + fpsimd_bind_task_to_cpu(); + } else { + fpsimd_to_sve(current); + } +} + /* * Trapped SVE access * @@ -1304,22 +1383,77 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs) WARN_ON(1); /* SVE access shouldn't have trapped */ /* - * Convert the FPSIMD state to SVE, zeroing all the state that - * is not shared with FPSIMD. If (as is likely) the current - * state is live in the registers then do this there and - * update our metadata for the current task including - * disabling the trap, otherwise update our in-memory copy. + * Even if the task can have used streaming mode we can only + * generate SVE access traps in normal SVE mode and + * transitioning out of streaming mode may discard any + * streaming mode state. Always clear the high bits to avoid + * any potential errors tracking what is properly initialised. */ + sve_init_regs(); + + put_cpu_fpsimd_context(); +} + +/* + * Trapped SME access + * + * Storage is allocated for the full SVE and SME state, the current + * FPSIMD register contents are migrated to SVE if SVE is not already + * active, and the access trap is disabled. + * + * TIF_SME should be clear on entry: otherwise, fpsimd_restore_current_state() + * would have disabled the SME access trap for userspace during + * ret_to_user, making an SVE access trap impossible in that case. + */ +void do_sme_acc(unsigned int esr, struct pt_regs *regs) +{ + /* Even if we chose not to use SME, the hardware could still trap: */ + if (unlikely(!system_supports_sme()) || WARN_ON(is_compat_task())) { + force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0); + return; + } + + /* + * If this not a trap due to SME being disabled then something + * is being used in the wrong mode, report as SIGILL. + */ + if (ESR_ELx_ISS(esr) != ESR_ELx_SME_ISS_SME_DISABLED) { + force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0); + return; + } + + sve_alloc(current); + sme_alloc(current); + if (!current->thread.sve_state || !current->thread.za_state) { + force_sig(SIGKILL); + return; + } + + get_cpu_fpsimd_context(); + + /* With TIF_SME userspace shouldn't generate any traps */ + if (test_and_set_thread_flag(TIF_SME)) + WARN_ON(1); + if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { unsigned long vq_minus_one = - sve_vq_from_vl(task_get_sve_vl(current)) - 1; - sve_set_vq(vq_minus_one); - sve_flush_live(true, vq_minus_one); + sve_vq_from_vl(task_get_sme_vl(current)) - 1; + sme_set_vq(vq_minus_one); + fpsimd_bind_task_to_cpu(); - } else { - fpsimd_to_sve(current); } + /* + * If SVE was not already active initialise the SVE registers, + * any non-shared state between the streaming and regular SVE + * registers is architecturally guaranteed to be zeroed when + * we enter streaming mode. We do not need to initialize ZA + * since ZA must be disabled at this point and enabling ZA is + * architecturally defined to zero ZA. + */ + if (system_supports_sve() && !test_thread_flag(TIF_SVE)) + sve_init_regs(); + put_cpu_fpsimd_context(); } @@ -1436,8 +1570,11 @@ void fpsimd_flush_thread(void) fpsimd_flush_thread_vl(ARM64_VEC_SVE); } - if (system_supports_sme()) + if (system_supports_sme()) { + clear_thread_flag(TIF_SME); + sme_free(current); fpsimd_flush_thread_vl(ARM64_VEC_SME); + } put_cpu_fpsimd_context(); } @@ -1487,15 +1624,24 @@ static void fpsimd_bind_task_to_cpu(void) last->svcr = ¤t->thread.svcr; current->thread.fpsimd_cpu = smp_processor_id(); + /* + * Toggle SVE and SME trapping for userspace if needed, these + * are serialsied by ret_to_user(). + */ + if (system_supports_sme()) { + if (test_thread_flag(TIF_SME)) + sme_user_enable(); + else + sme_user_disable(); + } + if (system_supports_sve()) { - /* Toggle SVE trapping for userspace if needed */ if (test_thread_flag(TIF_SVE)) sve_user_enable(); else sve_user_disable(); - - /* Serialised by exception return to user */ } + } void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state, diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index f2d32a29641c..f7fcc625ea0e 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -299,17 +299,19 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) BUILD_BUG_ON(!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK)); /* - * Detach src's sve_state (if any) from dst so that it does not - * get erroneously used or freed prematurely. dst's sve_state - * will be allocated on demand later on if dst uses SVE. - * For consistency, also clear TIF_SVE here: this could be done + * Detach src's sve/za_state (if any) from dst so that it does not + * get erroneously used or freed prematurely. dst's copies + * will be allocated on demand later on if dst uses SVE/SME. + * For consistency, also clear TIF_SVE/SME here: this could be done * later in copy_process(), but to avoid tripping up future - * maintainers it is best not to leave TIF_SVE and sve_state in + * maintainers it is best not to leave TIF flags and buffers in * an inconsistent state, even temporarily. */ dst->thread.sve_state = NULL; clear_tsk_thread_flag(dst, TIF_SVE); + dst->thread.za_state = NULL; + clear_tsk_thread_flag(dst, TIF_SME); dst->thread.svcr = 0; /* clear any pending asynchronous tag fault raised by the parent */ diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c index c938603b3ba0..958b2d926354 100644 --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -158,11 +158,41 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr, syscall_trace_exit(regs); } -static inline void sve_user_discard(void) +/* + * As per the ABI exit SME streaming mode and clear the SVE state not + * shared with FPSIMD on syscall entry. + */ +static inline void fp_user_discard(void) { + /* + * If SME is active then exit streaming mode. If ZA is active + * then flush the SVE registers but leave userspace access to + * both SVE and SME enabled, otherwise disable SME for the + * task and fall through to disabling SVE too. This means + * that after a syscall we never have any SME register state + * to track, if this changes the KVM code will need updating. + */ + if (system_supports_sme() && test_thread_flag(TIF_SME)) { + u64 svcr = read_sysreg_s(SYS_SVCR_EL0); + + if (svcr & SYS_SVCR_EL0_SM_MASK) + sme_smstop_sm(); + + if (!(svcr & SYS_SVCR_EL0_ZA_MASK)) { + clear_thread_flag(TIF_SME); + sme_user_disable(); + } + } + + if (!system_supports_sve()) return; + /* + * If SME is not active then disable SVE, the registers will + * be cleared when userspace next attempts to access them and + * we do not need to track the SVE register state until then. + */ clear_thread_flag(TIF_SVE); /* @@ -177,7 +207,7 @@ static inline void sve_user_discard(void) void do_el0_svc(struct pt_regs *regs) { - sve_user_discard(); + fp_user_discard(); el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table); } From patchwork Wed Jan 26 15:10:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537020 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 757EEC28CF5 for ; Wed, 26 Jan 2022 15:12:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242609AbiAZPMt (ORCPT ); Wed, 26 Jan 2022 10:12:49 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41410 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242606AbiAZPMs (ORCPT ); Wed, 26 Jan 2022 10:12:48 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7FD65B81EA4 for ; Wed, 26 Jan 2022 15:12:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DC7F4C340EC; Wed, 26 Jan 2022 15:12:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209966; bh=CyYorHNQCHBoqo8xQvMZzI3T4+46kiWpXhSL8tXeuog=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NBJGyYqHXEDMMPBM35xpXoLDbMRV3iLQYMg4PV0e5tgtMEx8xbNZBl8GtBW5MquN7 paUhmkKeR7UWd2s509KXZauZizDUqNXAdmrv13SaHofPz+Z+ysD6Uw7Whcv45Nfevm 6L1joTjGY+iX/GEZmqTss1wqDqwZMoNUXgUVVubYFjcnddvx/hIQT7C2reot4I0JMw mh8b+R1BaRkMkd9MXnejxKyE0bLHF28fb0cJlX9PAiPWYPEATEhKsICqcd9dcWrmL4 98l0NegA/SWmrNQy2OkjAJWQaTMTJPd1ARnd5g7pX/QMecNUymtGaBYEgU1qSO8h87 rC7GA0ZoMAZbg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 18/40] arm64/sme: Disable ZA and streaming mode when handling signals Date: Wed, 26 Jan 2022 15:10:58 +0000 Message-Id: <20220126151120.3811248-19-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=991; h=from:subject; bh=CyYorHNQCHBoqo8xQvMZzI3T4+46kiWpXhSL8tXeuog=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSEGMK5oeSY6nFUOtibYupakncbZg936OSHhw4h 4O/+R66JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkhAAKCRAk1otyXVSH0KamB/ 4pimLKdBRbJEaraXrVBKJqymrEQxdZ5oq/4KXqblXFnvx+vGJORilrgkGRLOvqolv/1hpAeVXtLB08 ok52z/hFl1XAu0jGpKB9HCAz5PrgYPzBTNS6p980JcUxj2ukwqFLAyM1nB5EpkOgsd687CFX8W9GSi //p6dsi28vGQbCKnW74bHAKWfhFGnMzGURAc1L/QfUzf43YamYSarYbxlpMJsLnXgr9N1en2Euhd8O mFlTjYcVXoEerrKMTidrmf8q7uGmWlGE9wKpQHxEZnzAUKw4ziqjf/UQ5/lDnIICEXInAjCyWJZUks VtyzDDRnHdq9JjUXDGdbMW5LZY86Wj X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The ABI requires that streaming mode and ZA are disabled when invoking signal handlers, do this in setup_return() when we prepare the task state for the signal handler. Signed-off-by: Mark Brown --- arch/arm64/kernel/signal.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index d8aaf4b6f432..cda04fd73333 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -758,6 +758,13 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka, /* TCO (Tag Check Override) always cleared for signal handlers */ regs->pstate &= ~PSR_TCO_BIT; + /* Signal handlers are invoked with ZA and streaming mode disabled */ + if (system_supports_sme()) { + current->thread.svcr &= ~(SYS_SVCR_EL0_ZA_MASK | + SYS_SVCR_EL0_SM_MASK); + sme_smstop(); + } + if (ka->sa.sa_flags & SA_RESTORER) sigtramp = ka->sa.sa_restorer; else From patchwork Wed Jan 26 15:10:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D7CCC2BA4C for ; Wed, 26 Jan 2022 15:12:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242617AbiAZPMv (ORCPT ); Wed, 26 Jan 2022 10:12:51 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:41220 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242606AbiAZPMv (ORCPT ); Wed, 26 Jan 2022 10:12:51 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id BC9F261842 for ; Wed, 26 Jan 2022 15:12:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C3557C36AE5; Wed, 26 Jan 2022 15:12:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209970; bh=aUBAeA8epzt4DxHOr/8Fwmtcfowksew8sOb3hVCUWlo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tQg0RGYcKhs4NfcZuSt4ISs1omMO/JD9KXWUmi9M87r/7vXQb47VVx/qqo6MS5q2O y8AUV8zgAuN1kqvEtqp/B0hrKSEq3oU1xyTl/mXuJTbw1kALVPwC97qd5i3DLotEVW Ho/05HeMwm88q3fWJikKq1vULHU7BHqNNhoV3i4sT+TRO0omqPjeHp1nyUGIFD6p9F Qu0tDnc7xqCHHDODWOCCVErzCzFysT0jS3Xinhrp+SyReQZ13sl7FSrCMOE4cF5n09 CkMoKqXdAC+I4UVe1RNauPb/HAJVNHBxkJrhecS1Ar/MkuwJfG7tSrEREMDebVx8nh IYuXT5uU64flQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 19/40] arm64/sme: Implement streaming SVE signal handling Date: Wed, 26 Jan 2022 15:10:59 +0000 Message-Id: <20220126151120.3811248-20-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6926; h=from:subject; bh=aUBAeA8epzt4DxHOr/8Fwmtcfowksew8sOb3hVCUWlo=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSFyZffwLSh1gpayP6gvSuF62lS+Td1u15fNbZg CYIMJzOJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkhQAKCRAk1otyXVSH0OUvB/ 0TR6YdELXyAS1onqXVrVMTEijC0/1dB2HQ1Tp+1FUKWoklYq5HODQCpF8uc6D6Ugw/xw9W8+9Ru88U mc8GOIlsbSB3nwKtLeAQNQI2yN751wBfyWwdyrzsKMDNQH4LTTF/R4hs48iW4nMZKrTUAfsDGJk/+l Z3jDRz25vl4BTdB1xZj3hFK3gDf5N7IHzbUckgGO662P+bZeW/WrbWGviRqgtS52+QbpEIMIhwo2CQ e8FIv0VAOx36nV/NuCVlCrtqVxMxO9sYxCfW3WKEyCgn7ewKMd89cEd3UNEAW8mn/FAEiI2MTbvx3W yX1AUlNIBh94BdwkBTBuWw1PStGOgr X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org When in streaming mode we have the same set of SVE registers as we do in regular SVE mode with the exception of FFR and the use of the SME vector length. Provide signal handling for these registers by taking one of the reserved words in the SVE signal context as a flags field and defining a flag with a flag which is set for streaming mode. When the flag is set the vector length is set to the streaming mode vector length and we save and restore streaming mode data. We support entering or leaving streaming mode based on the value of the flag but do not support changing the vector length, this is not currently supported SVE signal handling. We could instead allocate a separate record in the signal frame for the streaming mode SVE context but this inflates the size of the maximal signal frame required and adds complication when validating signal frames from userspace, especially given the current structure of the code. Any implementation of support for streaming mode vectors in signals will have some potential for causing issues for applications that attempt to handle SVE vectors in signals, use streaming mode but do not understand streaming mode in their signal handling code, it is hard to identify a case that is clearly better than any other - they all have cases where they could cause unexpected register corruption or faults. Signed-off-by: Mark Brown --- arch/arm64/include/uapi/asm/sigcontext.h | 16 ++++++-- arch/arm64/kernel/signal.c | 48 ++++++++++++++++++------ 2 files changed, 50 insertions(+), 14 deletions(-) diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h index 0c796c795dbe..3a3366d4fbc2 100644 --- a/arch/arm64/include/uapi/asm/sigcontext.h +++ b/arch/arm64/include/uapi/asm/sigcontext.h @@ -134,9 +134,12 @@ struct extra_context { struct sve_context { struct _aarch64_ctx head; __u16 vl; - __u16 __reserved[3]; + __u16 flags; + __u16 __reserved[2]; }; +#define SVE_SIG_FLAG_SM 0x1 /* Context describes streaming mode */ + #endif /* !__ASSEMBLY__ */ #include @@ -186,9 +189,16 @@ struct sve_context { * sve_context.vl must equal the thread's current vector length when * doing a sigreturn. * + * On systems with support for SME the SVE register state may reflect either + * streaming or non-streaming mode. In streaming mode the streaming mode + * vector length will be used and the flag SVE_SIG_FLAG_SM will be set in + * the flags field. It is permitted to enter or leave streaming mode in + * a signal return, applications should take care to ensure that any difference + * in vector length between the two modes is handled, including any resixing + * and movement of context blocks. * - * Note: for all these macros, the "vq" argument denotes the SVE - * vector length in quadwords (i.e., units of 128 bits). + * Note: for all these macros, the "vq" argument denotes the vector length + * in quadwords (i.e., units of 128 bits). * * The correct way to obtain vq is to use sve_vq_from_vl(vl). The * result is valid if and only if sve_vl_valid(vl) is true. This is diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index cda04fd73333..5e6ec6ed8a1c 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -227,11 +227,17 @@ static int preserve_sve_context(struct sve_context __user *ctx) { int err = 0; u16 reserved[ARRAY_SIZE(ctx->__reserved)]; + u16 flags = 0; unsigned int vl = task_get_sve_vl(current); unsigned int vq = 0; - if (test_thread_flag(TIF_SVE)) + if (thread_sm_enabled(¤t->thread)) { + vl = task_get_sme_vl(current); vq = sve_vq_from_vl(vl); + flags |= SVE_SIG_FLAG_SM; + } else if (test_thread_flag(TIF_SVE)) { + vq = sve_vq_from_vl(vl); + } memset(reserved, 0, sizeof(reserved)); @@ -239,6 +245,7 @@ static int preserve_sve_context(struct sve_context __user *ctx) __put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16), &ctx->head.size, err); __put_user_error(vl, &ctx->vl, err); + __put_user_error(flags, &ctx->flags, err); BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved)); err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved)); @@ -259,18 +266,28 @@ static int preserve_sve_context(struct sve_context __user *ctx) static int restore_sve_fpsimd_context(struct user_ctxs *user) { int err; - unsigned int vq; + unsigned int vl, vq; struct user_fpsimd_state fpsimd; struct sve_context sve; if (__copy_from_user(&sve, user->sve, sizeof(sve))) return -EFAULT; - if (sve.vl != task_get_sve_vl(current)) + if (sve.flags & SVE_SIG_FLAG_SM) { + if (!system_supports_sme()) + return -EINVAL; + + vl = task_get_sme_vl(current); + } else { + vl = task_get_sve_vl(current); + } + + if (sve.vl != vl) return -EINVAL; if (sve.head.size <= sizeof(*user->sve)) { clear_thread_flag(TIF_SVE); + current->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK; goto fpsimd_only; } @@ -302,7 +319,10 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user) if (err) return -EFAULT; - set_thread_flag(TIF_SVE); + if (sve.flags & SVE_SIG_FLAG_SM) + current->thread.svcr |= SYS_SVCR_EL0_SM_MASK; + else + set_thread_flag(TIF_SVE); fpsimd_only: /* copy the FP and status/control registers */ @@ -394,7 +414,7 @@ static int parse_user_sigframe(struct user_ctxs *user, break; case SVE_MAGIC: - if (!system_supports_sve()) + if (!system_supports_sve() && !system_supports_sme()) goto invalid; if (user->sve) @@ -593,11 +613,16 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user, if (system_supports_sve()) { unsigned int vq = 0; - if (add_all || test_thread_flag(TIF_SVE)) { - int vl = sve_max_vl(); + if (add_all || test_thread_flag(TIF_SVE) || + thread_sm_enabled(¤t->thread)) { + int vl = max(sve_max_vl(), sme_max_vl()); - if (!add_all) - vl = task_get_sve_vl(current); + if (!add_all) { + if (thread_sm_enabled(¤t->thread)) + vl = task_get_sme_vl(current); + else + vl = task_get_sve_vl(current); + } vq = sve_vq_from_vl(vl); } @@ -648,8 +673,9 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user, __put_user_error(current->thread.fault_code, &esr_ctx->esr, err); } - /* Scalable Vector Extension state, if present */ - if (system_supports_sve() && err == 0 && user->sve_offset) { + /* Scalable Vector Extension state (including streaming), if present */ + if ((system_supports_sve() || system_supports_sme()) && + err == 0 && user->sve_offset) { struct sve_context __user *sve_ctx = apply_user_offset(user, user->sve_offset); err |= preserve_sve_context(sve_ctx); From patchwork Wed Jan 26 15:11:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537019 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 829D0C3526D for ; Wed, 26 Jan 2022 15:12:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242623AbiAZPM4 (ORCPT ); Wed, 26 Jan 2022 10:12:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242619AbiAZPMz (ORCPT ); Wed, 26 Jan 2022 10:12:55 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 277B6C06161C for ; Wed, 26 Jan 2022 07:12:55 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AD1EE61868 for ; Wed, 26 Jan 2022 15:12:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A6CCBC36AEB; Wed, 26 Jan 2022 15:12:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209974; bh=oW8pCZKYXjjBMZgQkaxtsed3e5u7xxSj/4h8WMIhNyc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qSLL+UtEOrWe1LIFfLVEWJXlRRkXyyK2IcJuS1HzOUjNbAjnVOwCYEZtDjpoPsSDs BTQUu1Ih4Bl1/Bc/YBplWAj058dw6VixkWkPiEHIqCZFZ1hfljRhF5DXivlDodwlsj ist2KgH3GIEOUKOh35RReoVzyLlUVbIjfb1Ui0tFXSgyXHGu9/qLwYCIMIBDe7qjfv /XYG468hufzOe9iFonNJ5O2PatJBkw4GaILH4NloXt/+vl/8TqFRpw9DoWzpiO+LbQ DT7QeaVuWLlZP+wJuX+neCCZmowy325nXgukwNKPnKs6Wuv2zzAO6zfR8C1ZYkm2x3 KmkJAfYWd+mmg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 20/40] arm64/sme: Implement ZA signal handling Date: Wed, 26 Jan 2022 15:11:00 +0000 Message-Id: <20220126151120.3811248-21-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=8905; h=from:subject; bh=oW8pCZKYXjjBMZgQkaxtsed3e5u7xxSj/4h8WMIhNyc=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSG0hJdJxo+5Qop2soApSWgxTalG0mokuvaAz99 SzK+WkaJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkhgAKCRAk1otyXVSH0CeUCA CDqJsfpQukzZdjAq6Xj6HK3t3O3VElUr2f5ggjpi8nQgbHBvqdDxwxQ1LQj5LGIlGUnilvSfhcsc2o cddNH+ugz/HjtURTbwd8gN/g1PkBleg/PszlJOtjTQUnkNaBB4VNgtfueBj0Frvlw6YggAQL10ewUc /ylOwkLsbY7U59s7f81dm9GWGBzKEJRCNHMkEgCHKo8qNyPJA7qLv1JRi2MJY1plfcK/g0kbPg/eR0 lnLJwRYmzSPoIEfS3dR4GXhJX9EnhVIZOTh5BCZbb9jYoxmdopA5ZdaPEQEiTKf5lnnkjUIMHUUuPH QiMPjJ0hJjxqyENQ3zrhlpmZJ7pTiG X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Implement support for ZA in signal handling in a very similar way to how we implement support for SVE registers, using a signal context structure with optional register state after it. Where present this register state stores the ZA matrix as a series of horizontal vectors numbered from 0 to VL/8 in the endinanness independent format used for vectors. As with SVE we do not allow changes in the vector length during signal return but we do allow ZA to be enabled or disabled. Signed-off-by: Mark Brown --- arch/arm64/include/uapi/asm/sigcontext.h | 41 +++++++ arch/arm64/kernel/fpsimd.c | 3 - arch/arm64/kernel/signal.c | 139 +++++++++++++++++++++++ 3 files changed, 180 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h index 3a3366d4fbc2..d45bdf2c8b26 100644 --- a/arch/arm64/include/uapi/asm/sigcontext.h +++ b/arch/arm64/include/uapi/asm/sigcontext.h @@ -140,6 +140,14 @@ struct sve_context { #define SVE_SIG_FLAG_SM 0x1 /* Context describes streaming mode */ +#define ZA_MAGIC 0x54366345 + +struct za_context { + struct _aarch64_ctx head; + __u16 vl; + __u16 __reserved[3]; +}; + #endif /* !__ASSEMBLY__ */ #include @@ -259,4 +267,37 @@ struct sve_context { #define SVE_SIG_CONTEXT_SIZE(vq) \ (SVE_SIG_REGS_OFFSET + SVE_SIG_REGS_SIZE(vq)) +/* + * If the ZA register is enabled for the thread at signal delivery then, + * za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl)) + * and the register data may be accessed using the ZA_SIG_*() macros. + * + * If za_context.head.size < ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl)) + * then ZA was not enabled and no register data was included in which case + * ZA register was not enabled for the thread and no register data + * the ZA_SIG_*() macros should not be used except for this check. + * + * The same convention applies when returning from a signal: a caller + * will need to remove or resize the za_context block if it wants to + * enable the ZA register when it was previously non-live or vice-versa. + * This may require the caller to allocate fresh memory and/or move other + * context blocks in the signal frame. + * + * Changing the vector length during signal return is not permitted: + * za_context.vl must equal the thread's current SME vector length when + * doing a sigreturn. + */ + +#define ZA_SIG_REGS_OFFSET \ + ((sizeof(struct za_context) + (__SVE_VQ_BYTES - 1)) \ + / __SVE_VQ_BYTES * __SVE_VQ_BYTES) + +#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES)) + +#define ZA_SIG_ZAV_OFFSET(vq, n) (ZA_SIG_REGS_OFFSET + \ + (SVE_SIG_ZREG_SIZE(vq) * n)) + +#define ZA_SIG_CONTEXT_SIZE(vq) \ + (ZA_SIG_REGS_OFFSET + ZA_SIG_REGS_SIZE(vq)) + #endif /* _UAPI__ASM_SIGCONTEXT_H */ diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 33f80512753d..2242c14a5a05 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1190,9 +1190,6 @@ void fpsimd_release_task(struct task_struct *dead_task) #ifdef CONFIG_ARM64_SME -/* This will move to uapi/asm/sigcontext.h when signals are implemented */ -#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES)) - /* * Ensure that task->thread.za_state is allocated and sufficiently large. * diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index 5e6ec6ed8a1c..8efa78765cad 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -57,6 +57,7 @@ struct rt_sigframe_user_layout { unsigned long fpsimd_offset; unsigned long esr_offset; unsigned long sve_offset; + unsigned long za_offset; unsigned long extra_offset; unsigned long end_offset; }; @@ -219,6 +220,7 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) struct user_ctxs { struct fpsimd_context __user *fpsimd; struct sve_context __user *sve; + struct za_context __user *za; }; #ifdef CONFIG_ARM64_SVE @@ -347,6 +349,101 @@ extern int restore_sve_fpsimd_context(struct user_ctxs *user); #endif /* ! CONFIG_ARM64_SVE */ +#ifdef CONFIG_ARM64_SME + +static int preserve_za_context(struct za_context __user *ctx) +{ + int err = 0; + u16 reserved[ARRAY_SIZE(ctx->__reserved)]; + unsigned int vl = task_get_sme_vl(current); + unsigned int vq; + + if (thread_za_enabled(¤t->thread)) + vq = sve_vq_from_vl(vl); + else + vq = 0; + + memset(reserved, 0, sizeof(reserved)); + + __put_user_error(ZA_MAGIC, &ctx->head.magic, err); + __put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16), + &ctx->head.size, err); + __put_user_error(vl, &ctx->vl, err); + BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved)); + err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved)); + + if (vq) { + /* + * This assumes that the ZA state has already been saved to + * the task struct by calling the function + * fpsimd_signal_preserve_current_state(). + */ + err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET, + current->thread.za_state, + ZA_SIG_REGS_SIZE(vq)); + } + + return err ? -EFAULT : 0; +} + +static int restore_za_context(struct user_ctxs __user *user) +{ + int err; + unsigned int vq; + struct za_context za; + + if (__copy_from_user(&za, user->za, sizeof(za))) + return -EFAULT; + + if (za.vl != task_get_sme_vl(current)) + return -EINVAL; + + if (za.head.size <= sizeof(*user->za)) { + current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK; + return 0; + } + + vq = sve_vq_from_vl(za.vl); + + if (za.head.size < ZA_SIG_CONTEXT_SIZE(vq)) + return -EINVAL; + + /* + * Careful: we are about __copy_from_user() directly into + * thread.za_state with preemption enabled, so protection is + * needed to prevent a racing context switch from writing stale + * registers back over the new data. + */ + + fpsimd_flush_task_state(current); + /* From now, fpsimd_thread_switch() won't touch thread.sve_state */ + + sme_alloc(current); + if (!current->thread.za_state) { + current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK; + clear_thread_flag(TIF_SME); + return -ENOMEM; + } + + err = __copy_from_user(current->thread.za_state, + (char __user const *)user->za + + ZA_SIG_REGS_OFFSET, + ZA_SIG_REGS_SIZE(vq)); + if (err) + return -EFAULT; + + set_thread_flag(TIF_SME); + current->thread.svcr |= SYS_SVCR_EL0_ZA_MASK; + + return 0; +} +#else /* ! CONFIG_ARM64_SME */ + +/* Turn any non-optimised out attempts to use these into a link error: */ +extern int preserve_za_context(void __user *ctx); +extern int restore_za_context(struct user_ctxs *user); + +#endif /* ! CONFIG_ARM64_SME */ static int parse_user_sigframe(struct user_ctxs *user, struct rt_sigframe __user *sf) @@ -361,6 +458,7 @@ static int parse_user_sigframe(struct user_ctxs *user, user->fpsimd = NULL; user->sve = NULL; + user->za = NULL; if (!IS_ALIGNED((unsigned long)base, 16)) goto invalid; @@ -426,6 +524,19 @@ static int parse_user_sigframe(struct user_ctxs *user, user->sve = (struct sve_context __user *)head; break; + case ZA_MAGIC: + if (!system_supports_sme()) + goto invalid; + + if (user->za) + goto invalid; + + if (size < sizeof(*user->za)) + goto invalid; + + user->za = (struct za_context __user *)head; + break; + case EXTRA_MAGIC: if (have_extra_context) goto invalid; @@ -549,6 +660,9 @@ static int restore_sigframe(struct pt_regs *regs, } } + if (err == 0 && system_supports_sme() && user.za) + err = restore_za_context(&user); + return err; } @@ -633,6 +747,24 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user, return err; } + if (system_supports_sme()) { + unsigned int vl; + unsigned int vq = 0; + + if (add_all) + vl = sme_max_vl(); + else + vl = task_get_sme_vl(current); + + if (thread_za_enabled(¤t->thread)) + vq = sve_vq_from_vl(vl); + + err = sigframe_alloc(user, &user->za_offset, + ZA_SIG_CONTEXT_SIZE(vq)); + if (err) + return err; + } + return sigframe_alloc_end(user); } @@ -681,6 +813,13 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user, err |= preserve_sve_context(sve_ctx); } + /* ZA state if present */ + if (system_supports_sme() && err == 0 && user->za_offset) { + struct za_context __user *za_ctx = + apply_user_offset(user, user->za_offset); + err |= preserve_za_context(za_ctx); + } + if (err == 0 && user->extra_offset) { char __user *sfp = (char __user *)user->sigframe; char __user *userp = From patchwork Wed Jan 26 15:11:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBD6FC28CF5 for ; Wed, 26 Jan 2022 15:12:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242624AbiAZPM7 (ORCPT ); Wed, 26 Jan 2022 10:12:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242627AbiAZPM7 (ORCPT ); Wed, 26 Jan 2022 10:12:59 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08F14C06161C for ; Wed, 26 Jan 2022 07:12:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8FDB961376 for ; Wed, 26 Jan 2022 15:12:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 92918C36AE7; Wed, 26 Jan 2022 15:12:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209978; bh=54fbHBhPsozwjp/3pATyx8OpVN6ucv7EKpZAVjJ2qAQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nuUGXubyhKW6+PJ67AbtdFHCzjaH3Rq1IgRRo+sin1h8zonU8CUDiW0GN4KSqVWp8 O/nopIZUH853X42kjdPL6aA+3mM9wXIf5HEt39UiPkL8o9YvsQLyZlMmOsFiKR4Usb YlDEZfaTY83Wrs5S6UFgLuvpGROVuGH/2Pw2dkhC45Y3wkwJbtpSmY73tBSxzjBcDP Lpdobnnp6gFjSNpWKUxRTAjEs/ChHMTCDbCPZtwvYEBpLkcnJ86KjaHZjgZQvFoL56 SV5AZ309bbaTMOd+6OuXOs3YfVovo5gKFbN5Rf2h8H2NktoNoIk1nWRBOX95Be+c+p bnrVTodDkFU/Q== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 21/40] arm64/sme: Implement ptrace support for streaming mode SVE registers Date: Wed, 26 Jan 2022 15:11:01 +0000 Message-Id: <20220126151120.3811248-22-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=14915; h=from:subject; bh=54fbHBhPsozwjp/3pATyx8OpVN6ucv7EKpZAVjJ2qAQ=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSHNBn52S0bIWVYmB4TcfeoCi0Q1cKp21rjNspq DD4Ok3uJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkhwAKCRAk1otyXVSH0I0wB/ 9uUTauT7I5O5l0vd5iCwdqWsPEawNBudb+rjhd2tjakoCOeLty8qZOTpjBgVOM/gqRTDYM8pNVvs5D qLyXmZm5tWsKCGf9zIla9mpi+J6abYuVaMaOsN7O3/1fKpUeWP+n+YjR+zNEm2jV2zeJ55+6DLbC5a OGxJ76RkWGxFamJVGvJHFrbx39UOHbBebjI/j7dYbYlcLZmWQV4RBGc3cU7HrOfvIxDB+ber8SxtFY WqkyZEeoFOKyWOuu/gQ5Ni2SUKPAYblW5KYexHXszmVPZ8pEJllgV96nTXO1DM7LWSZBjL3jUQ5Lpb DJTFz7paG/DclaRR//GFvMa2M+rUTW X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The streaming mode SVE registers are represented using the same data structures as for SVE but since the vector lengths supported and in use may not be the same as SVE we represent them with a new type NT_ARM_SSVE. Unfortunately we only have a single 16 bit reserved field available in the header so there is no space to fit the current and maximum vector length for both standard and streaming SVE mode without redefining the structure in a way the creates a complicatd and fragile ABI. Since FFR is not present in streaming mode it is read and written as zero. Setting NT_ARM_SSVE registers will put the task into streaming mode, similarly setting NT_ARM_SVE registers will exit it. Reads that do not correspond to the current mode of the task will return the header with no register data. For compatibility reasons on write setting no flag for the register type will be interpreted as setting SVE registers, though users can provide no register data as an alternative mechanism for doing so. Signed-off-by: Mark Brown --- arch/arm64/include/uapi/asm/ptrace.h | 13 +- arch/arm64/kernel/fpsimd.c | 21 ++- arch/arm64/kernel/ptrace.c | 212 +++++++++++++++++++++------ include/uapi/linux/elf.h | 1 + 4 files changed, 190 insertions(+), 57 deletions(-) diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 758ae984ff97..522b925a78c1 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -109,7 +109,7 @@ struct user_hwdebug_state { } dbg_regs[16]; }; -/* SVE/FP/SIMD state (NT_ARM_SVE) */ +/* SVE/FP/SIMD state (NT_ARM_SVE & NT_ARM_SSVE) */ struct user_sve_header { __u32 size; /* total meaningful regset content in bytes */ @@ -220,6 +220,7 @@ struct user_sve_header { (SVE_PT_SVE_PREG_OFFSET(vq, __SVE_NUM_PREGS) - \ SVE_PT_SVE_PREGS_OFFSET(vq)) +/* For streaming mode SVE (SSVE) FFR must be read and written as zero */ #define SVE_PT_SVE_FFR_OFFSET(vq) \ (SVE_PT_REGS_OFFSET + __SVE_FFR_OFFSET(vq)) @@ -240,10 +241,12 @@ struct user_sve_header { - SVE_PT_SVE_OFFSET + (__SVE_VQ_BYTES - 1)) \ / __SVE_VQ_BYTES * __SVE_VQ_BYTES) -#define SVE_PT_SIZE(vq, flags) \ - (((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \ - SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \ - : SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags)) +#define SVE_PT_SIZE(vq, flags) \ + (((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \ + SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \ + : ((((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD ? \ + SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags) \ + : SVE_PT_REGS_OFFSET))) /* pointer authentication masks (NT_ARM_PAC_MASK) */ diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 2242c14a5a05..8b111b7f2006 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -645,14 +645,19 @@ static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst, */ static void fpsimd_to_sve(struct task_struct *task) { - unsigned int vq; + unsigned int vq, vl; void *sst = task->thread.sve_state; struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state; if (!system_supports_sve()) return; - vq = sve_vq_from_vl(task_get_sve_vl(task)); + if (thread_sm_enabled(&task->thread)) + vl = task_get_sme_vl(task); + else + vl = task_get_sve_vl(task); + + vq = sve_vq_from_vl(vl); __fpsimd_to_sve(sst, fst, vq); } @@ -669,7 +674,7 @@ static void fpsimd_to_sve(struct task_struct *task) */ static void sve_to_fpsimd(struct task_struct *task) { - unsigned int vq; + unsigned int vq, vl; void const *sst = task->thread.sve_state; struct user_fpsimd_state *fst = &task->thread.uw.fpsimd_state; unsigned int i; @@ -678,7 +683,12 @@ static void sve_to_fpsimd(struct task_struct *task) if (!system_supports_sve()) return; - vq = sve_vq_from_vl(task_get_sve_vl(task)); + if (thread_sm_enabled(&task->thread)) + vl = task_get_sme_vl(task); + else + vl = task_get_sve_vl(task); + + vq = sve_vq_from_vl(vl); for (i = 0; i < SVE_NUM_ZREGS; ++i) { p = (__uint128_t const *)ZREG(sst, vq, i); fst->vregs[i] = arm64_le128_to_cpu(*p); @@ -819,8 +829,7 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type, /* * To ensure the FPSIMD bits of the SVE vector registers are preserved, * write any live register state back to task_struct, and convert to a - * regular FPSIMD thread. Since the vector length can only be changed - * with a syscall we can't be in streaming mode while reconfiguring. + * regular FPSIMD thread. */ if (task == current) { get_cpu_fpsimd_context(); diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 39dbdfdc38d3..0b8324fa1abc 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -714,21 +714,51 @@ static int system_call_set(struct task_struct *target, #ifdef CONFIG_ARM64_SVE static void sve_init_header_from_task(struct user_sve_header *header, - struct task_struct *target) + struct task_struct *target, + enum vec_type type) { unsigned int vq; + bool active; + bool fpsimd_only; + enum vec_type task_type; memset(header, 0, sizeof(*header)); - header->flags = test_tsk_thread_flag(target, TIF_SVE) ? - SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD; - if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT)) - header->flags |= SVE_PT_VL_INHERIT; + /* Check if the requested registers are active for the task */ + if (thread_sm_enabled(&target->thread)) + task_type = ARM64_VEC_SME; + else + task_type = ARM64_VEC_SVE; + active = (task_type == type); + + switch (type) { + case ARM64_VEC_SVE: + if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT)) + header->flags |= SVE_PT_VL_INHERIT; + fpsimd_only = !test_tsk_thread_flag(target, TIF_SVE); + break; + case ARM64_VEC_SME: + if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT)) + header->flags |= SVE_PT_VL_INHERIT; + fpsimd_only = false; + break; + default: + WARN_ON_ONCE(1); + return; + } - header->vl = task_get_sve_vl(target); + if (active) { + if (fpsimd_only) { + header->flags |= SVE_PT_REGS_FPSIMD; + } else { + header->flags |= SVE_PT_REGS_SVE; + } + } + + header->vl = task_get_vl(target, type); vq = sve_vq_from_vl(header->vl); - header->max_vl = sve_max_vl(); + header->max_vl = vec_max_vl(type); header->size = SVE_PT_SIZE(vq, header->flags); header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl), SVE_PT_REGS_SVE); @@ -739,19 +769,17 @@ static unsigned int sve_size_from_header(struct user_sve_header const *header) return ALIGN(header->size, SVE_VQ_BYTES); } -static int sve_get(struct task_struct *target, - const struct user_regset *regset, - struct membuf to) +static int sve_get_common(struct task_struct *target, + const struct user_regset *regset, + struct membuf to, + enum vec_type type) { struct user_sve_header header; unsigned int vq; unsigned long start, end; - if (!system_supports_sve()) - return -EINVAL; - /* Header */ - sve_init_header_from_task(&header, target); + sve_init_header_from_task(&header, target, type); vq = sve_vq_from_vl(header.vl); membuf_write(&to, &header, sizeof(header)); @@ -759,49 +787,61 @@ static int sve_get(struct task_struct *target, if (target == current) fpsimd_preserve_current_state(); - /* Registers: FPSIMD-only case */ - BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header)); - if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) + BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header)); + + switch ((header.flags & SVE_PT_REGS_MASK)) { + case SVE_PT_REGS_FPSIMD: return __fpr_get(target, regset, to); - /* Otherwise: full SVE case */ + case SVE_PT_REGS_SVE: + start = SVE_PT_SVE_OFFSET; + end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq); + membuf_write(&to, target->thread.sve_state, end - start); - BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header)); - start = SVE_PT_SVE_OFFSET; - end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq); - membuf_write(&to, target->thread.sve_state, end - start); + start = end; + end = SVE_PT_SVE_FPSR_OFFSET(vq); + membuf_zero(&to, end - start); - start = end; - end = SVE_PT_SVE_FPSR_OFFSET(vq); - membuf_zero(&to, end - start); + /* + * Copy fpsr, and fpcr which must follow contiguously in + * struct fpsimd_state: + */ + start = end; + end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE; + membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr, + end - start); - /* - * Copy fpsr, and fpcr which must follow contiguously in - * struct fpsimd_state: - */ - start = end; - end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE; - membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr, end - start); + start = end; + end = sve_size_from_header(&header); + return membuf_zero(&to, end - start); - start = end; - end = sve_size_from_header(&header); - return membuf_zero(&to, end - start); + default: + return 0; + } } -static int sve_set(struct task_struct *target, +static int sve_get(struct task_struct *target, const struct user_regset *regset, - unsigned int pos, unsigned int count, - const void *kbuf, const void __user *ubuf) + struct membuf to) +{ + if (!system_supports_sve()) + return -EINVAL; + + return sve_get_common(target, regset, to, ARM64_VEC_SVE); +} + +static int sve_set_common(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf, + enum vec_type type) { int ret; struct user_sve_header header; unsigned int vq; unsigned long start, end; - if (!system_supports_sve()) - return -EINVAL; - /* Header */ if (count < sizeof(header)) return -EINVAL; @@ -814,13 +854,37 @@ static int sve_set(struct task_struct *target, * Apart from SVE_PT_REGS_MASK, all SVE_PT_* flags are consumed by * vec_set_vector_length(), which will also validate them for us: */ - ret = vec_set_vector_length(target, ARM64_VEC_SVE, header.vl, + ret = vec_set_vector_length(target, type, header.vl, ((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16); if (ret) goto out; /* Actual VL set may be less than the user asked for: */ - vq = sve_vq_from_vl(task_get_sve_vl(target)); + vq = sve_vq_from_vl(task_get_vl(target, type)); + + /* Enter/exit streaming mode */ + if (system_supports_sme()) { + u64 old_svcr = target->thread.svcr; + + switch (type) { + case ARM64_VEC_SVE: + target->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK; + break; + case ARM64_VEC_SME: + target->thread.svcr |= SYS_SVCR_EL0_SM_MASK; + break; + default: + WARN_ON_ONCE(1); + return -EINVAL; + } + + /* + * If we switched then invalidate any existing SVE + * state and ensure there's storage. + */ + if (target->thread.svcr != old_svcr) + sve_alloc(target); + } /* Registers: FPSIMD-only case */ @@ -832,7 +896,10 @@ static int sve_set(struct task_struct *target, goto out; } - /* Otherwise: full SVE case */ + /* + * Otherwise: no registers or full SVE case. For backwards + * compatibility reasons we treat empty flags as SVE registers. + */ /* * If setting a different VL from the requested VL and there is @@ -853,8 +920,9 @@ static int sve_set(struct task_struct *target, /* * Ensure target->thread.sve_state is up to date with target's - * FPSIMD regs, so that a short copyin leaves trailing registers - * unmodified. + * FPSIMD regs, so that a short copyin leaves trailing + * registers unmodified. Always enable SVE even if going into + * streaming mode. */ fpsimd_sync_to_sve(target); set_tsk_thread_flag(target, TIF_SVE); @@ -890,8 +958,46 @@ static int sve_set(struct task_struct *target, return ret; } +static int sve_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) +{ + if (!system_supports_sve()) + return -EINVAL; + + return sve_set_common(target, regset, pos, count, kbuf, ubuf, + ARM64_VEC_SVE); +} + #endif /* CONFIG_ARM64_SVE */ +#ifdef CONFIG_ARM64_SME + +static int ssve_get(struct task_struct *target, + const struct user_regset *regset, + struct membuf to) +{ + if (!system_supports_sme()) + return -EINVAL; + + return sve_get_common(target, regset, to, ARM64_VEC_SME); +} + +static int ssve_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) +{ + if (!system_supports_sme()) + return -EINVAL; + + return sve_set_common(target, regset, pos, count, kbuf, ubuf, + ARM64_VEC_SME); +} + +#endif /* CONFIG_ARM64_SME */ + #ifdef CONFIG_ARM64_PTR_AUTH static int pac_mask_get(struct task_struct *target, const struct user_regset *regset, @@ -1109,6 +1215,9 @@ enum aarch64_regset { #ifdef CONFIG_ARM64_SVE REGSET_SVE, #endif +#ifdef CONFIG_ARM64_SVE + REGSET_SSVE, +#endif #ifdef CONFIG_ARM64_PTR_AUTH REGSET_PAC_MASK, REGSET_PAC_ENABLED_KEYS, @@ -1189,6 +1298,17 @@ static const struct user_regset aarch64_regsets[] = { .set = sve_set, }, #endif +#ifdef CONFIG_ARM64_SME + [REGSET_SSVE] = { /* Streaming mode SVE */ + .core_note_type = NT_ARM_SSVE, + .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE), + SVE_VQ_BYTES), + .size = SVE_VQ_BYTES, + .align = SVE_VQ_BYTES, + .regset_get = ssve_get, + .set = ssve_set, + }, +#endif #ifdef CONFIG_ARM64_PTR_AUTH [REGSET_PAC_MASK] = { .core_note_type = NT_ARM_PAC_MASK, diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 61bf4774b8f2..61502388683f 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -427,6 +427,7 @@ typedef struct elf64_shdr { #define NT_ARM_PACG_KEYS 0x408 /* ARM pointer authentication generic key */ #define NT_ARM_TAGGED_ADDR_CTRL 0x409 /* arm64 tagged address control (prctl()) */ #define NT_ARM_PAC_ENABLED_KEYS 0x40a /* arm64 ptr auth enabled keys (prctl()) */ +#define NT_ARM_SSVE 0x40b /* ARM Streaming SVE registers */ #define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */ #define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */ #define NT_MIPS_DSP 0x800 /* MIPS DSP ASE registers */ From patchwork Wed Jan 26 15:11:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537018 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B34EBC3526D for ; Wed, 26 Jan 2022 15:13:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235785AbiAZPNF (ORCPT ); Wed, 26 Jan 2022 10:13:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235709AbiAZPNE (ORCPT ); Wed, 26 Jan 2022 10:13:04 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70A59C06161C for ; Wed, 26 Jan 2022 07:13:04 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 10187B81CA7 for ; Wed, 26 Jan 2022 15:13:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 78312C36AE9; Wed, 26 Jan 2022 15:12:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209981; bh=7yIiqkcfbJp87t6cPR9WNAAcTxL4NrY+L/kW3l1UCeY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=V34CT8kL1RrNtuoQC8QzH1a4CS9n5oEM9JwgZC/+RBuiCpakY7ErmpgoxvVfISDVi sOJ6f0At1KzeWYnbf4glKPKAB5XqnhfbO8b6XC6LYugKttJ+tO/44Udj91xHGLiLMU iBFvpXtYEJ3xXVPRRWBS8TYDu5GEcPAz17QTdTJdisyK062WLRHqtPrDiuVOvDmHSc tKBxGVj0rXBx/vRB8VPi5mWtqtfEbCs5umkAE/Q/E/4hr0QttGiLahzZ459tbnG51T f2THa/SUuY892J645Q2zHcvheCz4ra7UPV8Hy7tWM5g3c2RruDSKhn0gAKoCb3QNj9 ho2zDkHCpi6lw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 22/40] arm64/sme: Add ptrace support for ZA Date: Wed, 26 Jan 2022 15:11:02 +0000 Message-Id: <20220126151120.3811248-23-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=8277; h=from:subject; bh=7yIiqkcfbJp87t6cPR9WNAAcTxL4NrY+L/kW3l1UCeY=; b=owGbwMvMwMWocq27KDak/QLjabUkhsSPKR168j1Huba3xr5k4tOfJ6jFH6N0r94rzdijqdX6kn+Q wJFORmMWBkYuBlkxRZa1zzJWpYdLbJ3/aP4rmEGsTCBTGLg4BWAibCzsv1krVgsUb9gUuo0jUO6+aF Kpg6zJ1b3tPzc1JU9xOsJbtnjXs/77rN7SfjdLHb0zg1z/Ct8UP5/lcUBMQuv7i2z22ZuXKkVslb+m VcqZG1R8KnflwxP5D8/4BV4LkdTZpXx6nyNrUrbT+YoDqjOn+PnVbnnH615UdVoypvKWHff8KS8zap PuTVb87bXL3epNHoO3gtCHdV2LMvn+hJdMibUpjUpmzi5SSJi4p3dbVkz6nvligTWd7v6XJ87fFp3B Uy9wIvbX3fszDG2uBM0tmHI0T6awL1vtYMKpqXP3XTVi/zVrv3/pNP16Je9rq56yaK8sO9c5/VadEc /twzs43b/8yGEVXXVA1PuhO78sAA== X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The ZA array can be read and written with the NT_ARM_ZA. Similarly to our interface for the SVE vector registers the regset consists of a header with information on the current vector length followed by an optional register data payload, represented as for signals as a series of horizontal vectors from 0 to VL/8 in the endianness independent format used for vectors. On get if ZA is enabled then register data will be provided, otherwise it will be omitted. On set if register data is provided then ZA is enabled and initialized using the provided data, otherwise it is disabled. Signed-off-by: Mark Brown --- arch/arm64/include/uapi/asm/ptrace.h | 56 +++++++++++ arch/arm64/kernel/ptrace.c | 144 +++++++++++++++++++++++++++ include/uapi/linux/elf.h | 1 + 3 files changed, 201 insertions(+) diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 522b925a78c1..7fa2f7036aa7 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -268,6 +268,62 @@ struct user_pac_generic_keys { __uint128_t apgakey; }; +/* ZA state (NT_ARM_ZA) */ + +struct user_za_header { + __u32 size; /* total meaningful regset content in bytes */ + __u32 max_size; /* maxmium possible size for this thread */ + __u16 vl; /* current vector length */ + __u16 max_vl; /* maximum possible vector length */ + __u16 flags; + __u16 __reserved; +}; + +/* + * Common ZA_PT_* flags: + * These must be kept in sync with prctl interface in + */ +#define ZA_PT_VL_INHERIT ((1 << 17) /* PR_SME_VL_INHERIT */ >> 16) +#define ZA_PT_VL_ONEXEC ((1 << 18) /* PR_SME_SET_VL_ONEXEC */ >> 16) + + +/* + * The remainder of the ZA state follows struct user_za_header. The + * total size of the ZA state (including header) depends on the + * metadata in the header: ZA_PT_SIZE(vq, flags) gives the total size + * of the state in bytes, including the header. + * + * Refer to for details of how to pass the correct + * "vq" argument to these macros. + */ + +/* Offset from the start of struct user_za_header to the register data */ +#define ZA_PT_ZA_OFFSET \ + ((sizeof(struct user_za_header) + (__SVE_VQ_BYTES - 1)) \ + / __SVE_VQ_BYTES * __SVE_VQ_BYTES) + +/* + * The payload starts at offset ZA_PT_ZA_OFFSET, and is of size + * ZA_PT_ZA_SIZE(vq, flags). + * + * The ZA array is stored as a sequence of horizontal vectors ZAV of SVL/8 + * bytes each, starting from vector 0. + * + * Additional data might be appended in the future. + * + * The ZA matrix is represented in memory in an endianness-invariant layout + * which differs from the layout used for the FPSIMD V-registers on big-endian + * systems: see sigcontext.h for more explanation. + */ + +#define ZA_PT_ZAV_OFFSET(vq, n) \ + (ZA_PT_ZA_OFFSET + ((vq * __SVE_VQ_BYTES) * n)) + +#define ZA_PT_ZA_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES)) + +#define ZA_PT_SIZE(vq) \ + (ZA_PT_ZA_OFFSET + ZA_PT_ZA_SIZE(vq)) + #endif /* __ASSEMBLY__ */ #endif /* _UAPI__ASM_PTRACE_H */ diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 0b8324fa1abc..d6f27d3bf6e7 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -996,6 +996,141 @@ static int ssve_set(struct task_struct *target, ARM64_VEC_SME); } +static int za_get(struct task_struct *target, + const struct user_regset *regset, + struct membuf to) +{ + struct user_za_header header; + unsigned int vq; + unsigned long start, end; + + if (!system_supports_sme()) + return -EINVAL; + + /* Header */ + memset(&header, 0, sizeof(header)); + + if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT)) + header.flags |= ZA_PT_VL_INHERIT; + + header.vl = task_get_sme_vl(target); + vq = sve_vq_from_vl(header.vl); + header.max_vl = sme_max_vl(); + header.max_size = ZA_PT_SIZE(vq); + + /* If ZA is not active there is only the header */ + if (thread_za_enabled(&target->thread)) + header.size = ZA_PT_SIZE(vq); + else + header.size = ZA_PT_ZA_OFFSET; + + membuf_write(&to, &header, sizeof(header)); + + BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header)); + end = ZA_PT_ZA_OFFSET; +; + if (target == current) + fpsimd_preserve_current_state(); + + /* Any register data to include? */ + if (thread_za_enabled(&target->thread)) { + start = end; + end = ZA_PT_SIZE(vq); + membuf_write(&to, target->thread.za_state, end - start); + } + + /* Zero any trailing padding */ + start = end; + end = ALIGN(header.size, SVE_VQ_BYTES); + return membuf_zero(&to, end - start); +} + +static int za_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) +{ + int ret; + struct user_za_header header; + unsigned int vq; + unsigned long start, end; + + if (!system_supports_sme()) + return -EINVAL; + + /* Header */ + if (count < sizeof(header)) + return -EINVAL; + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header, + 0, sizeof(header)); + if (ret) + goto out; + + /* + * All current ZA_PT_* flags are consumed by + * vec_set_vector_length(), which will also validate them for + * us: + */ + ret = vec_set_vector_length(target, ARM64_VEC_SME, header.vl, + ((unsigned long)header.flags) << 16); + if (ret) + goto out; + + /* Actual VL set may be less than the user asked for: */ + vq = sve_vq_from_vl(task_get_sme_vl(target)); + + /* Ensure there is some SVE storage for streaming mode */ + if (!target->thread.sve_state) { + sve_alloc(target); + if (!target->thread.sve_state) { + clear_thread_flag(TIF_SME); + ret = -ENOMEM; + goto out; + } + } + + /* Allocate/reinit ZA storage */ + sme_alloc(target); + if (!target->thread.za_state) { + ret = -ENOMEM; + clear_tsk_thread_flag(target, TIF_SME); + goto out; + } + + /* If there is no data then disable ZA */ + if (!count) { + target->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK; + goto out; + } + + /* + * If setting a different VL from the requested VL and there is + * register data, the data layout will be wrong: don't even + * try to set the registers in this case. + */ + if (vq != sve_vq_from_vl(header.vl)) { + ret = -EIO; + goto out; + } + + BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header)); + start = ZA_PT_ZA_OFFSET; + end = ZA_PT_SIZE(vq); + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, + target->thread.za_state, + start, end); + if (ret) + goto out; + + /* Mark ZA as active and let userspace use it */ + set_tsk_thread_flag(target, TIF_SME); + target->thread.svcr |= SYS_SVCR_EL0_ZA_MASK; + +out: + fpsimd_flush_task_state(target); + return ret; +} + #endif /* CONFIG_ARM64_SME */ #ifdef CONFIG_ARM64_PTR_AUTH @@ -1217,6 +1352,7 @@ enum aarch64_regset { #endif #ifdef CONFIG_ARM64_SVE REGSET_SSVE, + REGSET_ZA, #endif #ifdef CONFIG_ARM64_PTR_AUTH REGSET_PAC_MASK, @@ -1308,6 +1444,14 @@ static const struct user_regset aarch64_regsets[] = { .regset_get = ssve_get, .set = ssve_set, }, + [REGSET_ZA] = { /* SME ZA */ + .core_note_type = NT_ARM_ZA, + .n = DIV_ROUND_UP(ZA_PT_ZA_SIZE(SVE_VQ_MAX), SVE_VQ_BYTES), + .size = SVE_VQ_BYTES, + .align = SVE_VQ_BYTES, + .regset_get = za_get, + .set = za_set, + }, #endif #ifdef CONFIG_ARM64_PTR_AUTH [REGSET_PAC_MASK] = { diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 61502388683f..7ef574f3256a 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -428,6 +428,7 @@ typedef struct elf64_shdr { #define NT_ARM_TAGGED_ADDR_CTRL 0x409 /* arm64 tagged address control (prctl()) */ #define NT_ARM_PAC_ENABLED_KEYS 0x40a /* arm64 ptr auth enabled keys (prctl()) */ #define NT_ARM_SSVE 0x40b /* ARM Streaming SVE registers */ +#define NT_ARM_ZA 0x40c /* ARM SME ZA registers */ #define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */ #define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */ #define NT_MIPS_DSP 0x800 /* MIPS DSP ASE registers */ From patchwork Wed Jan 26 15:11:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B194C28CF5 for ; Wed, 26 Jan 2022 15:13:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242606AbiAZPNH (ORCPT ); Wed, 26 Jan 2022 10:13:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41004 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235709AbiAZPNG (ORCPT ); Wed, 26 Jan 2022 10:13:06 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5BA5C06161C for ; Wed, 26 Jan 2022 07:13:06 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 553F861857 for ; Wed, 26 Jan 2022 15:13:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5C08FC36AE5; Wed, 26 Jan 2022 15:13:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209985; bh=+stvVyRnz2H4xbnaEtKWfcp0/IPDCeyODbC6Y5cXcCY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nB72mlesq+Au/VqjtZb8aIX4mDOkielDaAjw9jlCVRX4Y0e8JCVzCtEgJ/IWESeL5 R/4Fvt/k35LkgXsSiMk80sJoC0vJ8mjgqtdH1ceuU2Bu22LYsYLUqz2hLGExeJ+0Mq Md3W6MEQqb/yCGQBVBtCTDFEVQzT0uA15vB5I+zucm0NX2xIAJRmqlpX+WSNXVYOPo wFHOr3jgIqL3+Fzp/PIr0Q41r5mLdKeL3C6CbPGba+IuSCMa8Dk3U7yvREwWv8BNRp +lIcv7EKToZH+Zr4C/jqR87Wrg3NY6JR7EikQ12tJSU8tY65c/o8/k8j501PP/7iB+ jNpTkbksm2PKg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 23/40] arm64/sme: Disable streaming mode and ZA when flushing CPU state Date: Wed, 26 Jan 2022 15:11:03 +0000 Message-Id: <20220126151120.3811248-24-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1128; h=from:subject; bh=+stvVyRnz2H4xbnaEtKWfcp0/IPDCeyODbC6Y5cXcCY=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSIZjkrllyw3wrZv65gkzLMq7K6hvi8wR5NerZM dVGlefyJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkiAAKCRAk1otyXVSH0Ny2B/ 4nCSiDDbNsZAJkVTSzqfPPSmIOHrE3u6RKhTUbQEZrRIgsnTYyUlPDxQUlOqqRMxUQMw+c2UW0bAGA pX+g4HA+xh5C84zTK7gR/yd9LKSq9g2eIpGPmUkJGMtzp03CVRNlsP0ZxgnYaKG7WhFiwKDaKSiKw2 fQlQ5r2k7SDzi4LV7FJ9vxFaV2H9ypVvBxsxhs8O45V24Ko0akf0RnjUlUFpzc833WWwgxK5LnBb0C P5NfIDUTo1JC6QdpLVZxp4Onahtaai1jx7UKYnJyHXjIjQO+wiMu5J2jIGJsJWyZcq/hYUjaABoFNQ UdFkkLsCbkmeoXmRIlpO2YmYWf8/TX X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Both streaming mode and ZA may increase power consumption when they are enabled and streaming mode makes many FPSIMD and SVE instructions undefined which will cause problems for any kernel mode floating point so disable both when we flush the CPU state. This covers both kernel_neon_begin() and idle and after flushing the state a reload is always required anyway. Signed-off-by: Mark Brown --- arch/arm64/kernel/fpsimd.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 8b111b7f2006..e00d3a9e919c 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1762,6 +1762,15 @@ static void fpsimd_flush_cpu_state(void) { WARN_ON(!system_supports_fpsimd()); __this_cpu_write(fpsimd_last_state.st, NULL); + + /* + * Leaving streaming mode enabled will cause issues for any kernel + * NEON and leaving streaming mode or ZA enabled may increase power + * consumption. + */ + if (system_supports_sme()) + sme_smstop(); + set_thread_flag(TIF_FOREIGN_FPSTATE); } From patchwork Wed Jan 26 15:11:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537017 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C910C28CF5 for ; Wed, 26 Jan 2022 15:13:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235709AbiAZPNM (ORCPT ); Wed, 26 Jan 2022 10:13:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41022 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235791AbiAZPNM (ORCPT ); Wed, 26 Jan 2022 10:13:12 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27410C06161C for ; Wed, 26 Jan 2022 07:13:12 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D94F5B81E71 for ; Wed, 26 Jan 2022 15:13:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3E9B7C36AE7; Wed, 26 Jan 2022 15:13:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209989; bh=9LPfnNjILhCHVOzA/ALZlaZgBfzu64ZFSV1uulmc3Ng=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rgi/d6kBOI52gpRmnESktU1dKyIDT4yMQyZv+CDysuJnKwigOcl5Hnu0F+gjsZ2sC +FvvPBUic+twOJ4oFqspW3m2FOK9Vp4maO4sPaAN5aew2ICztl0aNt3zpPKIUtREBx Xd5V6Jap6kMt4O1Ei2iJOYGh7AJwSFRy9uLhtjvQnsEW7zbnPeuu42sPWR7iq4qbBM 1YTfklr3Jhc7FSw9E176lbNAFg7U/xPBrOgojudOE2J3ush5sDh5gMqng9UjKx28H5 ALCmG0G0U9C27QPkIo+vRaCXjhIXy+XvJSK+qjwjJ+lxCY0pxSIZGAxoh1wAwR1isV TGMdfgoOijYDQ== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 24/40] arm64/sme: Save and restore streaming mode over EFI runtime calls Date: Wed, 26 Jan 2022 15:11:04 +0000 Message-Id: <20220126151120.3811248-25-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3623; h=from:subject; bh=9LPfnNjILhCHVOzA/ALZlaZgBfzu64ZFSV1uulmc3Ng=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSJ+1rMeb1w4VGXA4pCLW80Hj1ML4fkGBbtIMNG NJ0XdkiJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkiQAKCRAk1otyXVSH0GvhCA CA2HQJOmg20/ZgXWPzessQE+AMxQ+RsrIOt4TaHURYEB94KBQNRY+4ZbgTDjo7ZyGwvydCC9KFt8ln BscZneyG2rV/WvrfvV1InDIosV/4DhIrO36eb9pRKqXC5kABQ9pns32FoOi9kJCjmR+ZC2CBcFPvuM f9XL9Tx9chgmgKj82cFUNhn5OLaJfeUXXLguAFSQbnCvjTwxOSqgsKuA0qtsAaoP9kw9tMIk+fxmKW rdZauPCdF4fUxxdUoZmFLnql0fsEc9orpIl2IX2aR/vdA756rSCNdEFiExG/Vsxk0BZGNrfoJG0JoC iwYByeB6oZSkuStc+n3/yEx8uD6mqo X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org When saving and restoring the floating point state over an EFI runtime call ensure that we handle streaming mode, only handling FFR if we are not in streaming mode and ensuring that we are in normal mode over the call into runtime services. We currently assume that ZA will not be modified by runtime services, the specification is not yet finalised so this may need updating if that changes. Signed-off-by: Mark Brown --- arch/arm64/kernel/fpsimd.c | 47 +++++++++++++++++++++++++++++++++----- 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index e00d3a9e919c..a7077a5d1ed2 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1059,21 +1059,25 @@ int vec_verify_vq_map(enum vec_type type) static void __init sve_efi_setup(void) { - struct vl_info *info = &vl_info[ARM64_VEC_SVE]; + int max_vl = 0; + int i; if (!IS_ENABLED(CONFIG_EFI)) return; + for (i = 0; i < ARRAY_SIZE(vl_info); i++) + max_vl = max(vl_info[i].max_vl, max_vl); + /* * alloc_percpu() warns and prints a backtrace if this goes wrong. * This is evidence of a crippled system and we are returning void, * so no attempt is made to handle this situation here. */ - if (!sve_vl_valid(info->max_vl)) + if (!sve_vl_valid(max_vl)) goto fail; efi_sve_state = __alloc_percpu( - SVE_SIG_REGS_SIZE(sve_vq_from_vl(info->max_vl)), SVE_VQ_BYTES); + SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl)), SVE_VQ_BYTES); if (!efi_sve_state) goto fail; @@ -1848,6 +1852,7 @@ EXPORT_SYMBOL(kernel_neon_end); static DEFINE_PER_CPU(struct user_fpsimd_state, efi_fpsimd_state); static DEFINE_PER_CPU(bool, efi_fpsimd_state_used); static DEFINE_PER_CPU(bool, efi_sve_state_used); +static DEFINE_PER_CPU(bool, efi_sm_state); /* * EFI runtime services support functions @@ -1882,12 +1887,28 @@ void __efi_fpsimd_begin(void) */ if (system_supports_sve() && likely(efi_sve_state)) { char *sve_state = this_cpu_ptr(efi_sve_state); + bool ffr = true; + u64 svcr; __this_cpu_write(efi_sve_state_used, true); + /* If we are in streaming mode don't touch FFR */ + if (system_supports_sme()) { + svcr = read_sysreg_s(SYS_SVCR_EL0); + + ffr = svcr & SYS_SVCR_EL0_SM_MASK; + + __this_cpu_write(efi_sm_state, ffr); + } + sve_save_state(sve_state + sve_ffr_offset(sve_max_vl()), &this_cpu_ptr(&efi_fpsimd_state)->fpsr, - true); + ffr); + + if (system_supports_sme()) + sysreg_clear_set_s(SYS_SVCR_EL0, + SYS_SVCR_EL0_SM_MASK, 0); + } else { fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state)); } @@ -1910,11 +1931,25 @@ void __efi_fpsimd_end(void) if (system_supports_sve() && likely(__this_cpu_read(efi_sve_state_used))) { char const *sve_state = this_cpu_ptr(efi_sve_state); + bool ffr = true; + + /* + * Restore streaming mode; EFI calls are + * normal function calls so should not return in + * streaming mode. + */ + if (system_supports_sme()) { + if (__this_cpu_read(efi_sm_state)) { + sysreg_clear_set_s(SYS_SVCR_EL0, + 0, + SYS_SVCR_EL0_SM_MASK); + ffr = false; + } + } - sve_set_vq(sve_vq_from_vl(sve_get_vl()) - 1); sve_load_state(sve_state + sve_ffr_offset(sve_max_vl()), &this_cpu_ptr(&efi_fpsimd_state)->fpsr, - true); + ffr); __this_cpu_write(efi_sve_state_used, false); } else { From patchwork Wed Jan 26 15:11:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537362 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 551A4C2BA4C for ; Wed, 26 Jan 2022 15:13:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242608AbiAZPNQ (ORCPT ); Wed, 26 Jan 2022 10:13:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235791AbiAZPNQ (ORCPT ); Wed, 26 Jan 2022 10:13:16 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDEE8C06161C for ; Wed, 26 Jan 2022 07:13:15 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B8395B81CA7 for ; Wed, 26 Jan 2022 15:13:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 260C6C340E6; Wed, 26 Jan 2022 15:13:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209993; bh=DtugFpthaiK0xEHP2NQTQkSOyUac+uhH3/vXGpUUR8E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Zc77pxZ1Az3YciBk2wwKQDpq4vBfj/RH2TYZ7DA3FshVdLc97E+QKutOYspwD8+j6 jKvLWRjpkSsAZAdZmhdvralm+d4tDPRMfcpvatd+w4uU92wQGdobAJWwtdGjmZ8Vqt O/32SK4VQrV6QFP2mcPYdgHx+EDuhHZ0z2Mf810o6nNZ41jgnhxTn2qz8NM8NbtZ7x vc1qZmTfmBgabk4sn+JqKF/u0KlM8aUzGGtXr0nUshOIxKZOTa8nrapksoZqKyJ1LW qCIhJCqbLYVWUT3BKG7fAVroh/TET/MBs72CJ8RsYLwbpqRzOSCZhOzRf3uXvrBA77 2sixuN2lHzV0w== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 25/40] KVM: arm64: Hide SME system registers from guests Date: Wed, 26 Jan 2022 15:11:05 +0000 Message-Id: <20220126151120.3811248-26-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2580; h=from:subject; bh=DtugFpthaiK0xEHP2NQTQkSOyUac+uhH3/vXGpUUR8E=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSKhVjgkrdByxSLKuspmwEp1Ss1KXA83wWthUlS vnIER7eJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkigAKCRAk1otyXVSH0DJaCA CGLW0BldusV/aXgDif7uqkawLEs38MDvLy8ThWfB8ScMlN2PNu78nyarrW6PvXcVEZwUdqAbbwj2W8 dSOnR8Dq+X4h47L7m2edeI8S8Ft/T030WVPmy8PRoVgAZfanGo3s1NX+FFzEL6elAkS6uxn2qDJtai mSa1M78BSLeIT8X+quu7iUaUzBZ+Qx7BKoWdSFRJh75RSxbZUs6UQZQUeKy7jLxxf276TDl/++RBOU 5dkLQvHbGIMt7ppHLYDy2AHTNdI9hlpBwrDhv17FaBstBPYbhdGzz9XlbMaGav9zhhfEwewPkbAlHA 3U6pk8tWhNuj0oDGBQaWJoOQXSa6EG X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org For the time being we do not support use of SME by KVM guests, support for this will be enabled in future. In order to prevent any side effects or side channels via the new system registers, including the EL0 read/write register TPIDR2, explicitly undefine all the system registers added by SME and mask out the SME bitfield in SYS_ID_AA64PFR1. Signed-off-by: Mark Brown --- arch/arm64/kvm/sys_regs.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 4dc2fba316ff..43516eda9143 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1089,6 +1089,8 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, case SYS_ID_AA64PFR1_EL1: if (!kvm_has_mte(vcpu->kvm)) val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE); + + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_SME); break; case SYS_ID_AA64ISAR1_EL1: if (!vcpu_has_ptrauth(vcpu)) @@ -1508,7 +1510,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3), ID_SANITISED(ID_AA64ZFR0_EL1), - ID_UNALLOCATED(4,5), + ID_HIDDEN(ID_AA64SMFR0_EL1), ID_UNALLOCATED(4,6), ID_UNALLOCATED(4,7), @@ -1551,6 +1553,8 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility }, { SYS_DESC(SYS_TRFCR_EL1), undef_access }, + { SYS_DESC(SYS_SMPRI_EL1), undef_access }, + { SYS_DESC(SYS_SMCR_EL1), undef_access }, { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 }, { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 }, { SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 }, @@ -1633,8 +1637,10 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_CCSIDR_EL1), access_ccsidr }, { SYS_DESC(SYS_CLIDR_EL1), access_clidr }, + { SYS_DESC(SYS_SMIDR_EL1), undef_access }, { SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 }, { SYS_DESC(SYS_CTR_EL0), access_ctr }, + { SYS_DESC(SYS_SVCR_EL0), undef_access }, { PMU_SYS_REG(SYS_PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr, .reg = PMCR_EL0 }, @@ -1674,6 +1680,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 }, { SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 }, + { SYS_DESC(SYS_TPIDR2_EL0), undef_access }, { SYS_DESC(SYS_SCXTNUM_EL0), undef_access }, From patchwork Wed Jan 26 15:11:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537016 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35AD7C3526D for ; Wed, 26 Jan 2022 15:13:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242619AbiAZPNS (ORCPT ); Wed, 26 Jan 2022 10:13:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41052 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235791AbiAZPNS (ORCPT ); Wed, 26 Jan 2022 10:13:18 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7ECF5C06161C for ; Wed, 26 Jan 2022 07:13:18 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1098461463 for ; Wed, 26 Jan 2022 15:13:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 09A96C340E3; Wed, 26 Jan 2022 15:13:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643209997; bh=zPY7MztNeoE37pwIcdD53A+8yVkZO7HD7I60oNurDIQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jrNBs0KRmioQSHV+3p1Bj1WvQ34d3K31ve/1tOselvfqbfqbihYAVCgWb6yYdFYUC PYhXo0+36xqzuPKvz8P60GxMJkVRI2O1L3lz5bBs2m+uZ4lATj17rvHwBauE4LMWov 4eYx72Rf5zeB9GrpXlUA1+Sododu4pqoR7aKRZCqbFCRqQ8IS67LC4mh937/Pi+rYE wp2hWlxS5/dbBVyG5ICyx+x1P4lWVkxoCiDaPXhOW1KDbrLl28d/ZrqTh/GQVrTY0b DMV/VVOA+QCkKRLgR1DE08rdmrTpSbAilCK2WnoeBtI0JYMJx2f641DfiXV24a/vRb dk7GjI7zXRy3g== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 26/40] KVM: arm64: Trap SME usage in guest Date: Wed, 26 Jan 2022 15:11:06 +0000 Message-Id: <20220126151120.3811248-27-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4157; h=from:subject; bh=zPY7MztNeoE37pwIcdD53A+8yVkZO7HD7I60oNurDIQ=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSLR0Rz8fI5m32styscC/lpTw1aQPp/qDGAgb66 dxZu8k2JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkiwAKCRAk1otyXVSH0G3rB/ 9uHjpghZVQAfBXZLhzt2XmJY/vePWk+y1yAdxb0d9mavaBolYBRRFxR2CjYQeb5i1C0a3cvyIL0Lu8 8jcyIGE2CIFU+c5UCV2T50UuY9NTxJFKSY+O68WuZTEP+P9fmLFHi8cBTXxWAW1sz/jyAmL1mchwQA B1cGEuM5n7Nu6TwZcV83Xw3wH6i6PZ+HMgJDLsmWf41Q4Cm/XxAI+ltkR7Gg1qE6UkRpAvG7fP+igM dBfLTodvwxzpIcg0JX3gQgVIMRj03+7xBxbCpTgc1D176lWsRsBOg1lKlucCdfN6dCCvMem44Zk0MR 5h2E2W3nkoykQlL2t59tlZyYZG3c+B X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org SME defines two new traps which need to be enabled for guests to ensure that they can't use SME, one for the main SME operations which mirrors the traps for SVE and another for access to TPIDR2 in SCTLR_EL2. For VHE manage SMEN along with ZEN in activate_traps() and the FP state management callbacks, along with SCTLR_EL2.EnTPIDR2. There is no existing dynamic management of SCTLR_EL2. For nVHE manage TSM in activate_traps() along with the fine grained traps for TPIDR2 and SMPRI. There is no existing dynamic management of fine grained traps. Signed-off-by: Mark Brown --- arch/arm64/kvm/hyp/nvhe/switch.c | 30 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/vhe/switch.c | 11 ++++++++++- 2 files changed, 40 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 6410d21d8695..184bf6bd79b9 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -47,10 +47,25 @@ static void __activate_traps(struct kvm_vcpu *vcpu) val |= CPTR_EL2_TFP | CPTR_EL2_TZ; __activate_traps_fpsimd32(vcpu); } + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + val |= CPTR_EL2_TSM; write_sysreg(val, cptr_el2); write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2); + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) && + cpus_have_final_cap(ARM64_HAS_FGT)) { + val = read_sysreg_s(SYS_HFGRTR_EL2); + val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK | + HFGxTR_EL2_nSMPRI_EL1_MASK); + write_sysreg_s(val, SYS_HFGRTR_EL2); + + val = read_sysreg_s(SYS_HFGWTR_EL2); + val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK | + HFGxTR_EL2_nSMPRI_EL1_MASK); + write_sysreg_s(val, SYS_HFGWTR_EL2); + } + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; @@ -94,9 +109,24 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2); + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) && + cpus_have_final_cap(ARM64_HAS_FGT)) { + u64 val; + + val = read_sysreg_s(SYS_HFGRTR_EL2); + val |= HFGxTR_EL2_nTPIDR_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK; + write_sysreg_s(val, SYS_HFGRTR_EL2); + + val = read_sysreg_s(SYS_HFGWTR_EL2); + val |= HFGxTR_EL2_nTPIDR_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK; + write_sysreg_s(val, SYS_HFGWTR_EL2); + } + cptr = CPTR_EL2_DEFAULT; if (vcpu_has_sve(vcpu) && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)) cptr |= CPTR_EL2_TZ; + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + cptr &= ~CPTR_EL2_TSM; write_sysreg(cptr, cptr_el2); write_sysreg(__kvm_hyp_host_vector, vbar_el2); diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 619353b06e38..21217485514d 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -38,7 +38,8 @@ static void __activate_traps(struct kvm_vcpu *vcpu) val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; - val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN); + val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN | + CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN); /* * With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to @@ -59,6 +60,10 @@ static void __activate_traps(struct kvm_vcpu *vcpu) __activate_traps_fpsimd32(vcpu); } + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + write_sysreg(read_sysreg(sctlr_el2) & ~SCTLR_ELx_ENTP2, + sctlr_el2); + write_sysreg(val, cpacr_el1); write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el1); @@ -80,6 +85,10 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) */ asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT)); + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + write_sysreg(read_sysreg(sctlr_el2) | SCTLR_ELx_ENTP2, + sctlr_el2); + write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1); write_sysreg(vectors, vbar_el1); } From patchwork Wed Jan 26 15:11:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F9B4C3526D for ; Wed, 26 Jan 2022 15:13:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235849AbiAZPNY (ORCPT ); Wed, 26 Jan 2022 10:13:24 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41734 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235791AbiAZPNY (ORCPT ); Wed, 26 Jan 2022 10:13:24 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 64214B81C0F for ; Wed, 26 Jan 2022 15:13:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E2D5BC340ED; Wed, 26 Jan 2022 15:13:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210002; bh=jEyud6x1YGBVsWHdD8Q9gNLZk7/BvtyPAjeOVpneBR8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=s01ocj9O+vmUI+2bzrankFQh1SucDFgBAON8TER9p74KL709oSTETT1EDDlHj4utA liDB3mTOMJmL6uuT6KRVadzwuAR2InKa0wDp3u90PuuZu9TMf+ZkR+fuOwu068jTG6 qU4d/o5ZCW0Vr41MEk8o79iiylyVnUftuNOnTRdn5WrVwE8vi8ByKkQLu3D9CKkZ0U VrCQafXCzAkVUoiFGHEf1JqCnSWZ6VvTusjblX1QFceqErYgd8RgBJc4mkWSfkwjrv mCmB4Xfg2k/r0Qrg64EUArcHIz693PUMjmTn+RYZpA7f/Un3n2bXkoCLUrx0DS8juR yHkFy8wIrIKBg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 27/40] KVM: arm64: Handle SME host state when running guests Date: Wed, 26 Jan 2022 15:11:07 +0000 Message-Id: <20220126151120.3811248-28-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4080; h=from:subject; bh=jEyud6x1YGBVsWHdD8Q9gNLZk7/BvtyPAjeOVpneBR8=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSMfJFkLKCJZq4Gps4zY52BZXzbcGLbOMe99Cdt hhGcKniJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkjAAKCRAk1otyXVSH0Ar+B/ 9kFLtL6e0zg6qzXXo3LYZrphaqADH0ZH9VhO7D4RPPRG4H08TRg7jwNEwCZIlT3uzQQQU390/3TNFI 9jb5XL4xekP5q35vPJivFuudpYZkKCd/6vRI7+0pMknqgwZ9kUtOp2MapmVpr5DEEyfXIaamxPc0X8 DrulZ8WwepvmNQGkyBJdy0cwqAR7EDOkVQyhC67E2iCtr+vuLeAr73XcjSEq8dcV+l1dm/vOWFLaeu SXN3evr2Hm8HR9aAU1gHRNLBbslHRc2tFraYQca7oSYBRzbKJLQPI6Lvyvtoeck14prm2Y8L7AHGEJ 0SkdEOaQcnK3Ykg5F4ilSvI16djBPC X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org While we don't currently support SME in guests we do currently support it for the host system so we need to take care of SME's impact, including the floating point register state, when running guests. Simiarly to SVE we need to manage the traps in CPACR_RL1, what is new is the handling of streaming mode and ZA. Normally we defer any handling of the floating point register state until the guest first uses it however if the system is in streaming mode FPSIMD and SVE operations may generate SME traps which we would need to distinguish from actual attempts by the guest to use SME. Rather than do this for the time being if we are in streaming mode when entering the guest we force the floating point state to be saved immediately and exit streaming mode, meaning that the guest won't generate SME traps for supported operations. We could handle ZA in the access trap similarly to the FPSIMD/SVE state without the disruption caused by streaming mode but for simplicity handle it the same way as streaming mode for now. This will be revisited when we support SME for guests (hopefully before SME hardware becomes available), for now it will only incur additional cost on systems with SME and even there only if streaming mode or ZA are enabled. Signed-off-by: Mark Brown --- arch/arm64/include/asm/kvm_host.h | 1 + arch/arm64/kvm/fpsimd.c | 38 +++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 7dc85d5a6552..404b7358ba96 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -438,6 +438,7 @@ struct kvm_vcpu_arch { #define KVM_ARM64_DEBUG_STATE_SAVE_SPE (1 << 12) /* Save SPE context if active */ #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context if active */ #define KVM_ARM64_FP_FOREIGN_FPSTATE (1 << 14) +#define KVM_ARM64_HOST_SME_ENABLED (1 << 15) /* SME enabled for EL0 */ #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \ KVM_GUESTDBG_USE_SW_BP | \ diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index 338733ac63f8..cecaddb644ce 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -82,6 +82,26 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN) vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED; + + /* + * We don't currently support SME guests but if we leave + * things in streaming mode then when the guest starts running + * FPSIMD or SVE code it may generate SME traps so as a + * special case if we are in streaming mode we force the host + * state to be saved now and exit streaming mode so that we + * don't have to handle any SME traps for valid guest + * operations. Do this for ZA as well for now for simplicity. + */ + if (system_supports_sme()) { + if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN) + vcpu->arch.flags |= KVM_ARM64_HOST_SME_ENABLED; + + if (read_sysreg_s(SYS_SVCR_EL0) & + (SYS_SVCR_EL0_SM_MASK | SYS_SVCR_EL0_ZA_MASK)) { + vcpu->arch.flags &= ~KVM_ARM64_FP_HOST; + fpsimd_save_and_flush_cpu_state(); + } + } } void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu) @@ -129,6 +149,24 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) local_irq_save(flags); + /* + * If we have VHE then the Hyp code will reset CPACR_EL1 to + * CPACR_EL1_DEFAULT and we need to reenable SME. + */ + if (has_vhe()) { + if (system_supports_sme()) { + /* Also restore EL0 state seen on entry */ + if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED) + sysreg_clear_set(CPACR_EL1, 0, + CPACR_EL1_SMEN_EL0EN | + CPACR_EL1_SMEN_EL1EN); + else + sysreg_clear_set(CPACR_EL1, + CPACR_EL1_SMEN_EL0EN, + CPACR_EL1_SMEN_EL1EN); + } + } + if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) { if (vcpu_has_sve(vcpu)) { __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR); From patchwork Wed Jan 26 15:11:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537015 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9D04C3526D for ; Wed, 26 Jan 2022 15:13:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235791AbiAZPNa (ORCPT ); Wed, 26 Jan 2022 10:13:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242625AbiAZPN3 (ORCPT ); Wed, 26 Jan 2022 10:13:29 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29D95C06161C for ; Wed, 26 Jan 2022 07:13:29 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id BB444617B4 for ; Wed, 26 Jan 2022 15:13:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C0567C340E6; Wed, 26 Jan 2022 15:13:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210008; bh=QxDCYADgLO7Ate8OqAbdluQUXHUE34NK/8kXtjiy66A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JHtV8VJLvuGbs3WmZsViKNdU5o3APfzD3v1wH73TlEn8/WGXUf3BhVo7Xpcy+X07e alTFNfoYZZ7hV61YByc4IazNRcpeinW585zTp9KcJvZ9otLmWM7h8bYJtel7bOUdvK K0x392mH+wjsQVI/UR0gLNLZB7hodOTb4H5sXuPw/wPpm1k6fKoXksbXEwsSACpvot W4IFCPJE1R4ptCwm5sscfEXrTdbiP980ndvDefu2iSO7aQuf0W7+/hLHFnOyglVCc7 IJet6YCxtNp0KhwfJwF9ZQu5YObj+PMHU+3Z4ff5kF6dcfdljEh2Du/FRshHnbANN1 QZ0YMFcr9DNYg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 28/40] arm64/sme: Provide Kconfig for SME Date: Wed, 26 Jan 2022 15:11:08 +0000 Message-Id: <20220126151120.3811248-29-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1350; h=from:subject; bh=QxDCYADgLO7Ate8OqAbdluQUXHUE34NK/8kXtjiy66A=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSMO5EXNCblah8mNd3crJRe3Ue7ZJ3oWFjj+/Wo dCuJNvuJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkjAAKCRAk1otyXVSH0JqyB/ 0QTisbTvOcPNbemLX9/+A5acoUoqosDxXpmP+/QRizvr7vIQO/vILQS2jbt/6YdX4WQHnesEwRKpfR Ow+7cQgQPnEht6HgTXwQKEpUZ+hxEZLyMsUho51y+i9giTQYNWkbe4qpZLd4oRnNaxkY0CP+a/YCgg 81ETu0W84H7FDqxaeAFKQrY+BmCB2eiA2jR0qe4Z5/iVzR9My9xEsO4mPuTAdkP28mMqMRvlBpQt7F YvW+TXcIh/b5z/niRxjBIyBHPvpKsGV2ER4XsZ5EPk7gUazv+vVEvThGRjkgeaIL8ECtOI3KWx2Dtx DDpCIWHu4J3aaWTlstWKrVKXC8gUYZ X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Now that basline support for the Scalable Matrix Extension (SME) is present introduce the Kconfig option allowing it to be built. While the feature registers don't impose a strong requirement for a system with SME to support SVE at runtime the support for streaming mode SVE is mostly shared with normal SVE so depend on SVE. Signed-off-by: Mark Brown --- arch/arm64/Kconfig | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 6978140edfa4..f60f3b04ddf5 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1839,6 +1839,17 @@ config ARM64_SVE booting the kernel. If unsure and you are not observing these symptoms, you should assume that it is safe to say Y. +config ARM64_SME + bool "ARM Scalable Matrix Extension support" + default y + depends on ARM64_SVE + help + The Scalable Matrix Extension (SME) is an extension to the AArch64 + execution state which utilises a substantial subset of the SVE + instruction set, together with the addition of new architectural + register state capable of holding two dimensional matrix tiles to + enable various matrix operations. + config ARM64_MODULE_PLTS bool "Use PLTs to allow module memory to spill over into vmalloc area" depends on MODULES From patchwork Wed Jan 26 15:11:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537360 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E547BC2BA4C for ; Wed, 26 Jan 2022 15:13:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242625AbiAZPNg (ORCPT ); Wed, 26 Jan 2022 10:13:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235868AbiAZPNg (ORCPT ); Wed, 26 Jan 2022 10:13:36 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 011E8C06161C for ; Wed, 26 Jan 2022 07:13:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BF5CFB81E15 for ; Wed, 26 Jan 2022 15:13:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4E78C340E3; Wed, 26 Jan 2022 15:13:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210013; bh=P2l5Y5PDPDJcPc6k74icHUD6sRurj6ctKkCgiE7n3uE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FGCKLwFG1LDFc2UKYhHR3wtHYWCSs8ISpVIJ8Y3A2wJViEVWWYOTmYMjF/hkwbXov b63nJSzCG5b2wQimooFAh/zxpPU08oO7ElB/B+DE5Zb8VOrzicn8fXQkcCbSUUDL6z ogylDfChH+30dEn4NLvU3Q953AOtzn4YTKXQIeotsxZelngRCmVk7N8+FN5xcN+eC/ 4vXZzJxgzmFRV6Slr923Lyk9rRslgyTpfRhieoS/EXY2CmgWRja0kmFO8/ZODzmvhq 9enhc/KZPchMitDxvX8vSvsnLBb0/rP4GEbJCnGnSFqn+NBYzNB/8AGUIWzzx+ATUV VYAOi6OO1HFCA== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 29/40] kselftest/arm64: sme: Add streaming SME support to vlset Date: Wed, 26 Jan 2022 15:11:09 +0000 Message-Id: <20220126151120.3811248-30-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1924; h=from:subject; bh=P2l5Y5PDPDJcPc6k74icHUD6sRurj6ctKkCgiE7n3uE=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSNOiCWCuqdEhEwDeB0fuhFIzO/DqyGIULZWFjy SmyMmrCJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkjQAKCRAk1otyXVSH0FDQB/ wJl2jDnNO26RagzF9Cq7elJL/aRx6EGGh2FQQ2jVwBFPnFsALroxhghHtBMciIUDTl/CyQdxPCEOPI WHjLnDWxewrmuylT0r9OG6pTjI1sXoVzjQ/uFqa/opmrKVLpKqva3BzCzQKgUt3IHXZObaQTTkTfj7 AYJ+9+meEPsHjVPnVk+p5x3jIZTGMEPknWI+E3b4ZAuzcFyXzDAF0YzKUMb6E6cRoS6ZSMEX82X/oA LgdO3ibrbIzlOPLVmdcL/LboEzy0zBz4FfnV/xiBZ4HqW6rKzlvOLzAyk/ndUEJfR/5DZTcRdK4Szq CusCDpI6GA+TpAnVjJFEqru2ECQb6T X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The Scalable Matrix Extenions (SME) introduces additional register state with configurable vector lengths, similar to SVE but configured separately. Extend vlset to support configuring this state with a --sme or -s command line option. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/vlset.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/arm64/fp/vlset.c b/tools/testing/selftests/arm64/fp/vlset.c index 308d27a68226..76912a581a95 100644 --- a/tools/testing/selftests/arm64/fp/vlset.c +++ b/tools/testing/selftests/arm64/fp/vlset.c @@ -22,12 +22,15 @@ static int inherit = 0; static int no_inherit = 0; static int force = 0; static unsigned long vl; +static int set_ctl = PR_SVE_SET_VL; +static int get_ctl = PR_SVE_GET_VL; static const struct option options[] = { { "force", no_argument, NULL, 'f' }, { "inherit", no_argument, NULL, 'i' }, { "max", no_argument, NULL, 'M' }, { "no-inherit", no_argument, &no_inherit, 1 }, + { "sme", no_argument, NULL, 's' }, { "help", no_argument, NULL, '?' }, {} }; @@ -50,6 +53,9 @@ static int parse_options(int argc, char **argv) case 'M': vl = SVE_VL_MAX; break; case 'f': force = 1; break; case 'i': inherit = 1; break; + case 's': set_ctl = PR_SME_SET_VL; + get_ctl = PR_SME_GET_VL; + break; case 0: break; default: goto error; } @@ -125,14 +131,14 @@ int main(int argc, char **argv) if (inherit) flags |= PR_SVE_VL_INHERIT; - t = prctl(PR_SVE_SET_VL, vl | flags); + t = prctl(set_ctl, vl | flags); if (t < 0) { fprintf(stderr, "%s: PR_SVE_SET_VL: %s\n", program_name, strerror(errno)); goto error; } - t = prctl(PR_SVE_GET_VL); + t = prctl(get_ctl); if (t == -1) { fprintf(stderr, "%s: PR_SVE_GET_VL: %s\n", program_name, strerror(errno)); From patchwork Wed Jan 26 15:11:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C71C3C2BA4C for ; Wed, 26 Jan 2022 15:13:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242607AbiAZPNk (ORCPT ); Wed, 26 Jan 2022 10:13:40 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:41820 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235868AbiAZPNj (ORCPT ); Wed, 26 Jan 2022 10:13:39 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BFBBFB81978 for ; Wed, 26 Jan 2022 15:13:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 04164C340EB; Wed, 26 Jan 2022 15:13:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210017; bh=fs6Mo/cNVCXgit6VI25SiK2O86CoOZYuGBLTQAR7ou4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pErGvnuztsHGvB9V8n18kB2xGImqKKlHDr5e5A/e7tZVWVNT4b9jQyTUsXKW7SVCx E3oPfhPT5pw12NL+A+L+kpmp2r7BwjPDf22VRczH5FhXEZ7lKQJEEZk63+UHgELGfy yABmk+VL5Obc5ViXE67GEgSpYScNjMx1AnI0AM2Btdmett1LYG5wTM+1RK3ZcTnz2J ILmKCNjS97EuaS2kdOvGsBmvscSHZH50JkjUFn2kEWf/i1g/w2vsxo1+Sy+Y8Xa4Tz O5K07eh7jurEwCxdqCLZ5sxKSQzFMhX7Ol0p5IfANPMHVeOQ2jUSVoiq8xe2RKFvkC 3gtMEKkVnr3yw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 30/40] kselftest/arm64: Add tests for TPIDR2 Date: Wed, 26 Jan 2022 15:11:10 +0000 Message-Id: <20220126151120.3811248-31-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=8971; h=from:subject; bh=fs6Mo/cNVCXgit6VI25SiK2O86CoOZYuGBLTQAR7ou4=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSOYfrKO9wqrVDyN1/vTE4zIqML+ldRAu4nWsi8 L63DQwiJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkjgAKCRAk1otyXVSH0JSWB/ 9yHXt3fqmFQAGVI165FJk32uX+wIssvPJVV/byhvzzVkCZIbmITtzvHY3DvB9A3ng6BYCGOhWfoMG7 YnFWDx5oqOVG1fAw7TcAstBddPPU0BoZd4dltkHLCcZYSLzruTAlWPHudhRXAnmMrzhY1k+naJ0WxS 9vyCuwWuwFLOvBoBw4MHQSg4wRK4d6ruXcq0EDAuPGO4tRih6833JjK6yeT7yn8wjUj+EI1XT6SiSm D6pGPHqzlu8LGZ54c22xje+B8XyIeChbi+mm7l1Lj1HDSk5VsPffamrC6VsFDG3ok/Hc9Fh+sFzfXx XtEjQPLy06xJs8iw8j9/V/fO0O1zka X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org The Scalable Matrix Extension adds a new system register TPIDR2 intended to be used by libc for its own thread specific use, add some kselftests which exercise the ABI for it. Since this test should with some adjustment work for TPIDR and any other similar registers added in future add tests for it in a separate directory rather than placing it with the other floating point tests, nothing existing looked suitable so I created a new test directory called "abi". Since this feature is intended to be used by libc the test is built as freestanding code using nolibc so we don't end up with the test program and libc both trying to manage the register simultaneously and distrupting each other. As a result of being written using nolibc rather than using hwcaps to identify if SME is available in the system we check for the default SME vector length configuration in proc, adding hwcap support to nolibc seems like disproportionate effort and didn't feel entirely idiomatic for what nolibc is trying to do. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/abi/.gitignore | 1 + tools/testing/selftests/arm64/abi/Makefile | 9 +- tools/testing/selftests/arm64/abi/tpidr2.c | 298 +++++++++++++++++++ 3 files changed, 307 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/arm64/abi/tpidr2.c diff --git a/tools/testing/selftests/arm64/abi/.gitignore b/tools/testing/selftests/arm64/abi/.gitignore index b79cf5814c23..b9e54417250d 100644 --- a/tools/testing/selftests/arm64/abi/.gitignore +++ b/tools/testing/selftests/arm64/abi/.gitignore @@ -1 +1,2 @@ syscall-abi +tpidr2 diff --git a/tools/testing/selftests/arm64/abi/Makefile b/tools/testing/selftests/arm64/abi/Makefile index 96eba974ac8d..c8d7f2495eb2 100644 --- a/tools/testing/selftests/arm64/abi/Makefile +++ b/tools/testing/selftests/arm64/abi/Makefile @@ -1,8 +1,15 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright (C) 2021 ARM Limited -TEST_GEN_PROGS := syscall-abi +TEST_GEN_PROGS := syscall-abi tpidr2 include ../../lib.mk $(OUTPUT)/syscall-abi: syscall-abi.c syscall-abi-asm.S + +# Build with nolibc since TPIDR2 is intended to be actively managed by +# libc and we're trying to test the functionality that it depends on here. +$(OUTPUT)/tpidr2: tpidr2.c + $(CC) -fno-asynchronous-unwind-tables -fno-ident -s -Os -nostdlib \ + -static -include ../../../../include/nolibc/nolibc.h \ + -ffreestanding -Wall $^ -o $@ -lgcc diff --git a/tools/testing/selftests/arm64/abi/tpidr2.c b/tools/testing/selftests/arm64/abi/tpidr2.c new file mode 100644 index 000000000000..351a098b503a --- /dev/null +++ b/tools/testing/selftests/arm64/abi/tpidr2.c @@ -0,0 +1,298 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include + +#define SYS_TPIDR2 "S3_3_C13_C0_5" + +#define EXPECTED_TESTS 5 + +static void putstr(const char *str) +{ + write(1, str, strlen(str)); +} + +static void putnum(unsigned int num) +{ + char c; + + if (num / 10) + putnum(num / 10); + + c = '0' + (num % 10); + write(1, &c, 1); +} + +static int tests_run; +static int tests_passed; +static int tests_failed; +static int tests_skipped; + +static void set_tpidr2(uint64_t val) +{ + asm volatile ( + "msr " SYS_TPIDR2 ", %0\n" + : + : "r"(val) + : "cc"); +} + +static uint64_t get_tpidr2(void) +{ + uint64_t val; + + asm volatile ( + "mrs %0, " SYS_TPIDR2 "\n" + : "=r"(val) + : + : "cc"); + + return val; +} + +static void print_summary(void) +{ + if (tests_passed + tests_failed + tests_skipped != EXPECTED_TESTS) + putstr("# UNEXPECTED TEST COUNT: "); + + putstr("# Totals: pass:"); + putnum(tests_passed); + putstr(" fail:"); + putnum(tests_failed); + putstr(" xfail:0 xpass:0 skip:"); + putnum(tests_skipped); + putstr(" error:0\n"); +} + +/* Processes should start with TPIDR2 == 0 */ +static int default_value(void) +{ + return get_tpidr2() == 0; +} + +/* If we set TPIDR2 we should read that value */ +static int write_read(void) +{ + set_tpidr2(getpid()); + + return getpid() == get_tpidr2(); +} + +/* If we set a value we should read the same value after scheduling out */ +static int write_sleep_read(void) +{ + set_tpidr2(getpid()); + + msleep(100); + + return getpid() == get_tpidr2(); +} + +/* + * If we fork the value in the parent should be unchanged and the + * child should start with the same value and be able to set its own + * value. + */ +static int write_fork_read(void) +{ + pid_t newpid, waiting, oldpid; + int status; + + set_tpidr2(getpid()); + + oldpid = getpid(); + newpid = fork(); + if (newpid == 0) { + /* In child */ + if (get_tpidr2() != oldpid) { + putstr("# TPIDR2 changed in child: "); + putnum(get_tpidr2()); + putstr("\n"); + exit(0); + } + + set_tpidr2(getpid()); + if (get_tpidr2() == getpid()) { + exit(1); + } else { + putstr("# Failed to set TPIDR2 in child\n"); + exit(0); + } + } + if (newpid < 0) { + putstr("# fork() failed: -"); + putnum(-newpid); + putstr("\n"); + return 0; + } + + for (;;) { + waiting = waitpid(newpid, &status, 0); + + if (waiting < 0) { + if (errno == EINTR) + continue; + putstr("# waitpid() failed: "); + putnum(errno); + putstr("\n"); + return 0; + } + if (waiting != newpid) { + putstr("# waitpid() returned wrong PID\n"); + return 0; + } + + if (!WIFEXITED(status)) { + putstr("# child did not exit\n"); + return 0; + } + + if (getpid() != get_tpidr2()) { + putstr("# TPIDR2 corrupted in parent\n"); + return 0; + } + + return WEXITSTATUS(status); + } +} + +/* + * sys_clone() has a lot of per architecture variation so just define + * it here rather than adding it to nolibc, plus the raw API is a + * little more convenient for this test. + */ +static int sys_clone(unsigned long clone_flags, unsigned long newsp, + int *parent_tidptr, unsigned long tls, + int *child_tidptr) +{ + return my_syscall5(__NR_clone, clone_flags, newsp, parent_tidptr, tls, + child_tidptr); +} + +/* + * If we clone with CLONE_SETTLS then the value in the parent should + * be unchanged and the child should start with zero and be able to + * set its own value. + */ +static int write_clone_read(void) +{ + int parent_tid, child_tid; + pid_t parent, waiting; + int ret, status; + + parent = getpid(); + set_tpidr2(parent); + + ret = sys_clone(CLONE_SETTLS, 0, &parent_tid, 0, &child_tid); + if (ret == -1) { + putstr("# clone() failed\n"); + putnum(errno); + putstr("\n"); + return 0; + } + + if (ret == 0) { + /* In child */ + if (get_tpidr2() != 0) { + putstr("# TPIDR2 non-zero in child: "); + putnum(get_tpidr2()); + putstr("\n"); + exit(0); + } + + if (gettid() == 0) + putstr("# Child TID==0\n"); + set_tpidr2(gettid()); + if (get_tpidr2() == gettid()) { + exit(1); + } else { + putstr("# Failed to set TPIDR2 in child\n"); + exit(0); + } + } + + for (;;) { + waiting = wait4(ret, &status, __WCLONE, NULL); + + if (waiting < 0) { + if (errno == EINTR) + continue; + putstr("# wait4() failed: "); + putnum(errno); + putstr("\n"); + return 0; + } + if (waiting != ret) { + putstr("# wait4() returned wrong PID "); + putnum(waiting); + putstr("\n"); + return 0; + } + + if (!WIFEXITED(status)) { + putstr("# child did not exit\n"); + return 0; + } + + if (parent != get_tpidr2()) { + putstr("# TPIDR2 corrupted in parent\n"); + return 0; + } + + return WEXITSTATUS(status); + } +} + +#define run_test(name) \ + if (name()) { \ + tests_passed++; \ + } else { \ + tests_failed++; \ + putstr("not "); \ + } \ + putstr("ok "); \ + putnum(++tests_run); \ + putstr(" " #name "\n"); + +int main(int argc, char **argv) +{ + int ret, i; + + putstr("TAP version 13\n"); + putstr("1.."); + putnum(EXPECTED_TESTS); + putstr("\n"); + + putstr("# PID: "); + putnum(getpid()); + putstr("\n"); + + /* + * This test is run with nolibc which doesn't support hwcap and + * it's probably disproportionate to implement so instead check + * for the default vector length configuration in /proc. + */ + ret = open("/proc/sys/abi/sme_default_vector_length", O_RDONLY, 0); + if (ret >= 0) { + run_test(default_value); + run_test(write_read); + run_test(write_sleep_read); + run_test(write_fork_read); + run_test(write_clone_read); + + } else { + putstr("# SME support not present\n"); + + for (i = 0; i < EXPECTED_TESTS; i++) { + putstr("ok "); + putnum(i); + putstr(" skipped, TPIDR2 not supported\n"); + } + + tests_skipped += EXPECTED_TESTS; + } + + print_summary(); + + return 0; +} From patchwork Wed Jan 26 15:11:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537013 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D69DC2BA4C for ; Wed, 26 Jan 2022 15:13:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235868AbiAZPNo (ORCPT ); Wed, 26 Jan 2022 10:13:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPNn (ORCPT ); Wed, 26 Jan 2022 10:13:43 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7038FC06161C for ; Wed, 26 Jan 2022 07:13:43 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 00B6C61463 for ; Wed, 26 Jan 2022 15:13:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3505AC340E8; Wed, 26 Jan 2022 15:13:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210022; bh=3il89y80lqKSeBlP8c8K13FChN+0Rfp1qjkKOqCKgPI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FWEnz3EIiHlmvbB7rQeaTws4pfN8vyb893hZehdNgL0TiAEWo2Mj99nefj7g/4FjL qjHEIFce8Rmkg2U/LjdEYkyuYygobDXqKiPmMdW8jGZpQp+eQqvEDGecYSMqv0SzcN 9tua8FDIbyh2B3pvkpCQxfZir8AW1Or/C/axfcuvvHIHOUWhQwedPgDfKvZErgksUy gEpTVxBME6urtRSDBO5dMwfZTmgHfMGWJ1tpDPc4eKnvPFiqDHPG9UIdy4m3O4xPlQ c5TZ+WHGBts5m/KracU2ruCizN5pwcwTd8HbMDAwRwY/b0XeZsC/ZpiQnRu9726nLX 9nLPQTMQTn+7Q== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 31/40] kselftest/arm64: Extend vector configuration API tests to cover SME Date: Wed, 26 Jan 2022 15:11:11 +0000 Message-Id: <20220126151120.3811248-32-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3696; h=from:subject; bh=3il89y80lqKSeBlP8c8K13FChN+0Rfp1qjkKOqCKgPI=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSPN4dR74Qw9LPAlbaGHQMLY9KQX+QsTH0nXuoi YlK5KcGJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkjwAKCRAk1otyXVSH0Mz4B/ 9tDFztfdOBzS3ehx8vyMgbf30A/qNcQPmY/ZdZ472gE9Li/AzhSNI8RQUW6X66icjk4x+6P3ypR4SQ e4TFRyI94rAQsmexQYEZSZNsbNaESW2rukTItfJKm2vJLjEJel0jFXmlaizAU8NqwBHpPizR+Ky96l P4tuu2nsCiFsVBnrdzE+YBlsIp/e+B1zU/FyYH0H5o9LbnaqqXHu7c5UlCV5Yl9zHM0uBSoaGvobEB EM9vdcAOrhrtdHMY+V4fnllyjgWvFeqZsoikKrSaPxZqvv3COi8CgVjvNX7IyttRDN0kKjvLlt2FnI VI6IS86/KKuq1BesENOYTLImnYhrUo X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Provide RDVL helpers for SME and extend the main vector configuration tests to cover SME. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/.gitignore | 1 + tools/testing/selftests/arm64/fp/Makefile | 3 ++- tools/testing/selftests/arm64/fp/rdvl-sme.c | 14 ++++++++++++++ tools/testing/selftests/arm64/fp/rdvl.S | 8 ++++++++ tools/testing/selftests/arm64/fp/rdvl.h | 1 + tools/testing/selftests/arm64/fp/vec-syscfg.c | 10 ++++++++++ 6 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/arm64/fp/rdvl-sme.c diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore index c50d86331ed2..6e9a610c5e5d 100644 --- a/tools/testing/selftests/arm64/fp/.gitignore +++ b/tools/testing/selftests/arm64/fp/.gitignore @@ -1,5 +1,6 @@ fp-pidbench fpsimd-test +rdvl-sme rdvl-sve sve-probe-vls sve-ptrace diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile index 95f0b877a060..a224fff8082b 100644 --- a/tools/testing/selftests/arm64/fp/Makefile +++ b/tools/testing/selftests/arm64/fp/Makefile @@ -3,7 +3,7 @@ CFLAGS += -I../../../../../usr/include/ TEST_GEN_PROGS := sve-ptrace sve-probe-vls vec-syscfg TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \ - rdvl-sve \ + rdvl-sme rdvl-sve \ sve-test sve-stress \ vlset @@ -13,6 +13,7 @@ fp-pidbench: fp-pidbench.S asm-utils.o $(CC) -nostdlib $^ -o $@ fpsimd-test: fpsimd-test.o asm-utils.o $(CC) -nostdlib $^ -o $@ +rdvl-sme: rdvl-sme.o rdvl.o rdvl-sve: rdvl-sve.o rdvl.o sve-ptrace: sve-ptrace.o sve-probe-vls: sve-probe-vls.o rdvl.o diff --git a/tools/testing/selftests/arm64/fp/rdvl-sme.c b/tools/testing/selftests/arm64/fp/rdvl-sme.c new file mode 100644 index 000000000000..49b0b2e08bac --- /dev/null +++ b/tools/testing/selftests/arm64/fp/rdvl-sme.c @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include + +#include "rdvl.h" + +int main(void) +{ + int vl = rdvl_sme(); + + printf("%d\n", vl); + + return 0; +} diff --git a/tools/testing/selftests/arm64/fp/rdvl.S b/tools/testing/selftests/arm64/fp/rdvl.S index c916c1c9defd..1a2ebf07a20c 100644 --- a/tools/testing/selftests/arm64/fp/rdvl.S +++ b/tools/testing/selftests/arm64/fp/rdvl.S @@ -8,3 +8,11 @@ rdvl_sve: hint 34 // BTI C rdvl x0, #1 ret + +.globl rdvl_sme +rdvl_sme: + hint 34 // BTI C + + .inst 0x4bf5820 // RDSVL x0, #1 + + ret diff --git a/tools/testing/selftests/arm64/fp/rdvl.h b/tools/testing/selftests/arm64/fp/rdvl.h index 7c9d953fc9e7..5d323679fbc9 100644 --- a/tools/testing/selftests/arm64/fp/rdvl.h +++ b/tools/testing/selftests/arm64/fp/rdvl.h @@ -3,6 +3,7 @@ #ifndef RDVL_H #define RDVL_H +int rdvl_sme(void); int rdvl_sve(void); #endif diff --git a/tools/testing/selftests/arm64/fp/vec-syscfg.c b/tools/testing/selftests/arm64/fp/vec-syscfg.c index c90658811a83..9bcfcdc34ee9 100644 --- a/tools/testing/selftests/arm64/fp/vec-syscfg.c +++ b/tools/testing/selftests/arm64/fp/vec-syscfg.c @@ -51,6 +51,16 @@ static struct vec_data vec_data[] = { .prctl_set = PR_SVE_SET_VL, .default_vl_file = "/proc/sys/abi/sve_default_vector_length", }, + { + .name = "SME", + .hwcap_type = AT_HWCAP2, + .hwcap = HWCAP2_SME, + .rdvl = rdvl_sme, + .rdvl_binary = "./rdvl-sme", + .prctl_get = PR_SME_GET_VL, + .prctl_set = PR_SME_SET_VL, + .default_vl_file = "/proc/sys/abi/sme_default_vector_length", + }, }; static int stdio_read_integer(FILE *f, const char *what, int *val) From patchwork Wed Jan 26 15:11:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537358 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1835C28CF5 for ; Wed, 26 Jan 2022 15:13:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242612AbiAZPNt (ORCPT ); Wed, 26 Jan 2022 10:13:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPNs (ORCPT ); Wed, 26 Jan 2022 10:13:48 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC034C06161C for ; Wed, 26 Jan 2022 07:13:48 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4D3FA61376 for ; Wed, 26 Jan 2022 15:13:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DD3A8C340EC; Wed, 26 Jan 2022 15:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210027; bh=bxmJiz4iJDucMw8+bb1RejhHVADIm4+kHNfyEyijSjk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MH/caVRnfOObfnnBhhgoaJ8CxCpgqpdjTiCIizA3bhxz9y3HrJTwc1RB+bmAfPR3f EAtkWiL7Q2mSqMYvYIorNPCNNGs8+ATAC12WvU+TEvlad8mnJ/p6FafKl/aM+lcOWU 9xrJIO17P7vTM37wUwFU9phLbUIX5lm8HVHw39MikUDfZP9K9LiLFB2FG+PsCtEvhw G1pQ8X0jKOv0uWFiT9ynfqUoaXhhRuukmrCZBZOp3gm88KDwwehvd8gpxNXRRL4MBk xYZbnYjdoYykwXn5xMQUbL4kH9PSMl9RgntepAVM9tF34qdji7oIfdGtUY8zQQt8xj /yGf6tFI98gfA== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 32/40] kselftest/arm64: sme: Provide streaming mode SVE stress test Date: Wed, 26 Jan 2022 15:11:12 +0000 Message-Id: <20220126151120.3811248-33-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=5076; h=from:subject; bh=bxmJiz4iJDucMw8+bb1RejhHVADIm4+kHNfyEyijSjk=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSQsLor7YdTJhC+qpcUC09jl9a2LtWwVEU/ZOxU RA60NfaJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkkAAKCRAk1otyXVSH0Nz/B/ 4xQdZmXAyTIzhdlCiCfwzxl6QWQTuXzJkL3Dq2iQFDMokNSgabawdYQjcde3AwZbQk7YOqc7B98l7t IUE+IFavI8Ns4Z/E28/nZzfdWAdGhvCsmYoTKv0hPowzK72xRpx+AjHEVho6xS11LPTuSlLiHbYoqu WWPXJYHO6JKqNvN8k09yU/QdywZRqlERl4OWQXHbWj/lA8KcS84yZeMwssXqteQ4CnDJ42wJNmfh2k Jv6VqAoA0ZxVHNhffxjAyL3cS7FDACXLO4+SccI2/KyRROLduLSt/DSqmB7PP4NKyljRkHqa9SOcWo cqmAVmZq1N+e8quV8ivYcv1RElqldk X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org One of the features of SME is the addition of streaming mode, in which we have access to a set of streaming mode SVE registers at the SME vector length. Since these are accessed using the SVE instructions let's reuse the existing SVE stress test for testing with a compile time option for controlling the few small differences needed: - Enter streaming mode immediately on starting the program. - In streaming mode FFR is removed so skip reading and writing FFR. In order to avoid requiring a cutting edge toolchain with SME support use the op/CR form for specifying SVCR. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/.gitignore | 1 + tools/testing/selftests/arm64/fp/Makefile | 3 + tools/testing/selftests/arm64/fp/ssve-stress | 59 ++++++++++++++++++++ tools/testing/selftests/arm64/fp/sve-test.S | 30 ++++++++++ 4 files changed, 93 insertions(+) create mode 100644 tools/testing/selftests/arm64/fp/ssve-stress diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore index 6e9a610c5e5d..5729a5b1adfc 100644 --- a/tools/testing/selftests/arm64/fp/.gitignore +++ b/tools/testing/selftests/arm64/fp/.gitignore @@ -5,5 +5,6 @@ rdvl-sve sve-probe-vls sve-ptrace sve-test +ssve-test vec-syscfg vlset diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile index a224fff8082b..e6643c9b0474 100644 --- a/tools/testing/selftests/arm64/fp/Makefile +++ b/tools/testing/selftests/arm64/fp/Makefile @@ -5,6 +5,7 @@ TEST_GEN_PROGS := sve-ptrace sve-probe-vls vec-syscfg TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \ rdvl-sme rdvl-sve \ sve-test sve-stress \ + ssve-test ssve-stress \ vlset all: $(TEST_GEN_PROGS) $(TEST_PROGS_EXTENDED) @@ -19,6 +20,8 @@ sve-ptrace: sve-ptrace.o sve-probe-vls: sve-probe-vls.o rdvl.o sve-test: sve-test.o asm-utils.o $(CC) -nostdlib $^ -o $@ +ssve-test: sve-test.S asm-utils.o + $(CC) -DSSVE -nostdlib $^ -o $@ vec-syscfg: vec-syscfg.o rdvl.o vlset: vlset.o diff --git a/tools/testing/selftests/arm64/fp/ssve-stress b/tools/testing/selftests/arm64/fp/ssve-stress new file mode 100644 index 000000000000..e2bd2cc184ad --- /dev/null +++ b/tools/testing/selftests/arm64/fp/ssve-stress @@ -0,0 +1,59 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-only +# Copyright (C) 2015-2019 ARM Limited. +# Original author: Dave Martin + +set -ue + +NR_CPUS=`nproc` + +pids= +logs= + +cleanup () { + trap - INT TERM CHLD + set +e + + if [ -n "$pids" ]; then + kill $pids + wait $pids + pids= + fi + + if [ -n "$logs" ]; then + cat $logs + rm $logs + logs= + fi +} + +interrupt () { + cleanup + exit 0 +} + +child_died () { + cleanup + exit 1 +} + +trap interrupt INT TERM EXIT + +for x in `seq 0 $((NR_CPUS * 4))`; do + log=`mktemp` + logs=$logs\ $log + ./ssve-test >$log & + pids=$pids\ $! +done + +# Wait for all child processes to be created: +sleep 10 + +while :; do + kill -USR1 $pids +done & +pids=$pids\ $! + +wait + +exit 1 diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S index f5b1b48ffff2..31764e8370db 100644 --- a/tools/testing/selftests/arm64/fp/sve-test.S +++ b/tools/testing/selftests/arm64/fp/sve-test.S @@ -156,6 +156,7 @@ endfunction // We fill the upper lanes of FFR with zeros. // Beware: corrupts P0. function setup_ffr +#ifndef SSVE mov x4, x30 and w0, w0, #0x3 @@ -178,6 +179,9 @@ function setup_ffr wrffr p0.b ret x4 +#else + ret +#endif endfunction // Trivial memory compare: compare x2 bytes starting at address x0 with @@ -260,6 +264,7 @@ endfunction // Beware -- corrupts P0. // Clobbers x0-x5. function check_ffr +#ifndef SSVE mov x3, x30 ldr x4, =scratch @@ -280,6 +285,9 @@ function check_ffr mov x2, x5 mov x30, x3 b memcmp +#else + ret +#endif endfunction // Any SVE register modified here can cause corruption in the main @@ -295,13 +303,26 @@ function irritator_handler movi v0.8b, #1 movi v9.16b, #2 movi v31.8b, #3 +#ifndef SSVE // And P0 rdffr p0.b // And FFR wrffr p15.b +#endif + + ret +endfunction + +#ifdef SSVE +function enable_sm + // Set SVCR.SM to 1, equivalent to SMSTART SM but doesn't need a + // SME capable toolchain. + mov x0, #1 + msr S3_3_C4_C2_2, x0 ret endfunction +#endif function terminate_handler mov w21, w0 @@ -359,6 +380,11 @@ endfunction .globl _start function _start _start: +#ifdef SSVE + puts "Streaming mode " + bl enable_sm +#endif + // Sanity-check and report the vector length rdvl x19, #8 @@ -407,6 +433,10 @@ _start: orr w2, w2, #SA_NODEFER bl setsignal +#ifdef SSVE + bl enable_sm // syscalls will have exited streaming mode +#endif + mov x22, #0 // generation number, increments per iteration .Ltest_loop: rdvl x0, #8 From patchwork Wed Jan 26 15:11:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537012 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 330DAC28CF5 for ; Wed, 26 Jan 2022 15:13:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235850AbiAZPNy (ORCPT ); Wed, 26 Jan 2022 10:13:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPNy (ORCPT ); Wed, 26 Jan 2022 10:13:54 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53AD6C06161C for ; Wed, 26 Jan 2022 07:13:54 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E6FCC617B4 for ; Wed, 26 Jan 2022 15:13:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5D877C340E8; Wed, 26 Jan 2022 15:13:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210033; bh=HZPqgEGfMDOJRbR9O3affXiwGZAfPzgR37m4iq1abJY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=aHPHxcSAlslb11pCOwl3Jo+DRTJxFbC+n3Xfdp9VIR+MX6Wxn5hBjvQ/NRn+1DlsL /T7cOObray3K3N8jD9JyKIuhTTxcQeU2xpHeTVbWE90QnibLlMqqqwGiQEgaRLFOIy 4KNU4/7xNHbSMZvhzCrS/RvC19/TM2m+VULyo4+PEkBEGIAGKcFfj49bNyHSGgLEp5 JvQqwUaD+QNfj681jJy4FPUGgjXCvR2+NrBeP0i+OF2NVrEtR1OCrEVIOPGo4P6GV9 8QdemV2FFCQEj1OlO1MRMHWqm+fqkrptISS250R+mGLGyMzMLqp6SJ1excOmwLcBvH dcc8P7Onk1q1A== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 33/40] kselftest/arm64: signal: Allow tests to be incompatible with features Date: Wed, 26 Jan 2022 15:11:13 +0000 Message-Id: <20220126151120.3811248-34-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3936; h=from:subject; bh=HZPqgEGfMDOJRbR9O3affXiwGZAfPzgR37m4iq1abJY=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSRpfEXhZMiZZEUpvipJ+vhpvj7BzhhYgFzTZOk zo69Z22JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkkQAKCRAk1otyXVSH0Lf3B/ 9Xg2sLa9D12aP+Oj3NXxL28J51+BSip6GT4MaLaJrm+vwjwgxNUuSi3GWZYCIE3oWuqaxhOg0PYPQy kn/3/o2X885AR7rP4YqgIHG4WGck5T68WkbJjrv7ocz58I1rZLT+jUaWkfhFoSJbTrh++bLubHs6Di QfWzsESxW/pypWZmQXswLblwX07lEFEUYJKOk0SEiYkR7wzNCgPG+Hlg23cKTxq51nIbYb6UmX8KJo lYGtVBEOfu5DI/9Rgyw4/jKN1Nyo+SQLOjFKJiavyHSEA8DonwaVTmeeDwvyQDEJTG3rcs1pjQWf/H rrX2E9g3MWBXc9dtSU82cc+1Cw9j4W X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Some features may invalidate some tests, for example by supporting an operation which would trap otherwise. Allow tests to list features that they are incompatible with so we can cover the case where a signal will be generated without disruption on systems where that won't happen. Signed-off-by: Mark Brown --- .../selftests/arm64/signal/test_signals.h | 1 + .../arm64/signal/test_signals_utils.c | 34 ++++++++++++++----- .../arm64/signal/test_signals_utils.h | 2 ++ 3 files changed, 28 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/arm64/signal/test_signals.h b/tools/testing/selftests/arm64/signal/test_signals.h index ebe8694dbef0..f909b70d9e98 100644 --- a/tools/testing/selftests/arm64/signal/test_signals.h +++ b/tools/testing/selftests/arm64/signal/test_signals.h @@ -53,6 +53,7 @@ struct tdescr { char *name; char *descr; unsigned long feats_required; + unsigned long feats_incompatible; /* bitmask of effectively supported feats: populated at run-time */ unsigned long feats_supported; bool initialized; diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.c b/tools/testing/selftests/arm64/signal/test_signals_utils.c index 2f8c23af3b5e..5743897984b0 100644 --- a/tools/testing/selftests/arm64/signal/test_signals_utils.c +++ b/tools/testing/selftests/arm64/signal/test_signals_utils.c @@ -36,6 +36,8 @@ static inline char *feats_to_string(unsigned long feats) { size_t flen = MAX_FEATS_SZ - 1; + feats_string[0] = '\0'; + for (int i = 0; i < FMAX_END; i++) { if (feats & (1UL << i)) { size_t tlen = strlen(feats_names[i]); @@ -256,7 +258,7 @@ int test_init(struct tdescr *td) td->minsigstksz = MINSIGSTKSZ; fprintf(stderr, "Detected MINSTKSIGSZ:%d\n", td->minsigstksz); - if (td->feats_required) { + if (td->feats_required || td->feats_incompatible) { td->feats_supported = 0; /* * Checking for CPU required features using both the @@ -267,15 +269,29 @@ int test_init(struct tdescr *td) if (getauxval(AT_HWCAP) & HWCAP_SVE) td->feats_supported |= FEAT_SVE; if (feats_ok(td)) { - fprintf(stderr, - "Required Features: [%s] supported\n", - feats_to_string(td->feats_required & - td->feats_supported)); + if (td->feats_required & td->feats_supported) + fprintf(stderr, + "Required Features: [%s] supported\n", + feats_to_string(td->feats_required & + td->feats_supported)); + if (!(td->feats_incompatible & td->feats_supported)) + fprintf(stderr, + "Incompatible Features: [%s] absent\n", + feats_to_string(td->feats_incompatible)); } else { - fprintf(stderr, - "Required Features: [%s] NOT supported\n", - feats_to_string(td->feats_required & - ~td->feats_supported)); + if ((td->feats_required & td->feats_supported) != + td->feats_supported) + fprintf(stderr, + "Required Features: [%s] NOT supported\n", + feats_to_string(td->feats_required & + ~td->feats_supported)); + if (td->feats_incompatible & td->feats_supported) + fprintf(stderr, + "Incompatible Features: [%s] supported\n", + feats_to_string(td->feats_incompatible & + ~td->feats_supported)); + + td->result = KSFT_SKIP; return 0; } diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.h b/tools/testing/selftests/arm64/signal/test_signals_utils.h index 6772b5c8d274..f3aa99ba67bb 100644 --- a/tools/testing/selftests/arm64/signal/test_signals_utils.h +++ b/tools/testing/selftests/arm64/signal/test_signals_utils.h @@ -18,6 +18,8 @@ void test_result(struct tdescr *td); static inline bool feats_ok(struct tdescr *td) { + if (td->feats_incompatible & td->feats_supported) + return false; return (td->feats_required & td->feats_supported) == td->feats_required; } From patchwork Wed Jan 26 15:11:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3839BC2BA4C for ; Wed, 26 Jan 2022 15:14:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242613AbiAZPOA (ORCPT ); Wed, 26 Jan 2022 10:14:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPOA (ORCPT ); Wed, 26 Jan 2022 10:14:00 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30572C06161C for ; Wed, 26 Jan 2022 07:14:00 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E3A4FB81CA7 for ; Wed, 26 Jan 2022 15:13:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF281C340E3; Wed, 26 Jan 2022 15:13:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210037; bh=Sg4MO9YZQcbV+koV+CMJiKCz9E7cFOVY1Zqkeaooq10=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fzjKnwEHreMc66l3kPGF2+vKp9+l+5aCBgUCtayC9Q6KWdWIiYRnhLCsqAVrar6gs I0Cc9UXUO2TIFoHxqUI/BgYCuAmgINoMEN9R+n3B1REXOQb7iqKAK92nWCqyVWj+O7 4ojBK5zA8bYN0Sjw8ctnjKcjAm6a3N/MHd3T6uHMUQhTTyjOwXyPxZASn9BjmTtHBg FCbnFKHEEUY7ms6S5DTCNisPg8qpbdPxWmEMfMcC6uabSG9VBEvlTyOahEeuQt7n79 CsqAMazEyEzDG3SK+pbcpWU74VsfQJDDYt2QnUYFfmbdc4y3A+qBmgThAHRLnKjT0O D28NbUkflRaZw== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 34/40] kselftest/arm64: signal: Handle ZA signal context in core code Date: Wed, 26 Jan 2022 15:11:14 +0000 Message-Id: <20220126151120.3811248-35-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3220; h=from:subject; bh=Sg4MO9YZQcbV+koV+CMJiKCz9E7cFOVY1Zqkeaooq10=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSSvpNMvy7SNGRTsSL8cEuGKairNIi9+eypDo1i pNtel2SJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkkgAKCRAk1otyXVSH0HtyB/ 0fb2wWNAmkiyXylja9l9e/IhJEoZjNo3HUCsgVclmoYP+6KWun+khRbrl4sH4rRAID7KMvZS1K6/Ui kvFx1Wnr9OYUg2VAs/pEtrrMfonKvWkzM3/H877JYoNbXd3D9o+ppI9MrPa9VWdZA71UKx/Si1LnY4 tarREqW2Jc7kgpzq337zb1A8Ah/4eWCe89sDcPUv22ynxtbgAO7sSqM41ZS0dFOPiheYYwc5WD/eV5 8KHQyLMmMzmpeJ4STcGnvM6kXi17pnQzEXG/gsNp0a14t8g2LSFDE6oA6D+Zmmdn8L6jiLkIAIQnKJ DGOEMHUdbz6Ko47IVJVXQiB914NBy7 X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org As part of the generic code for signal handling test cases we parse all signal frames to make sure they have at least the basic form we expect and that there are no unexpected frames present in the signal context. Add coverage of the ZA signal frame to this code. Signed-off-by: Mark Brown --- .../arm64/signal/testcases/testcases.c | 36 +++++++++++++++++++ .../arm64/signal/testcases/testcases.h | 3 +- 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c index 8c2a57fc2f9c..84c36bee4d82 100644 --- a/tools/testing/selftests/arm64/signal/testcases/testcases.c +++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c @@ -75,6 +75,31 @@ bool validate_sve_context(struct sve_context *sve, char **err) return true; } +bool validate_za_context(struct za_context *za, char **err) +{ + /* Size will be rounded up to a multiple of 16 bytes */ + size_t regs_size + = ((ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za->vl)) + 15) / 16) * 16; + + if (!za || !err) + return false; + + /* Either a bare za_context or a za_context followed by regs data */ + if ((za->head.size != sizeof(struct za_context)) && + (za->head.size != regs_size)) { + *err = "bad size for ZA context"; + return false; + } + + if (!sve_vl_valid(za->vl)) { + *err = "SME VL in ZA context invalid"; + + return false; + } + + return true; +} + bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err) { bool terminated = false; @@ -82,6 +107,7 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err) int flags = 0; struct extra_context *extra = NULL; struct sve_context *sve = NULL; + struct za_context *za = NULL; struct _aarch64_ctx *head = (struct _aarch64_ctx *)uc->uc_mcontext.__reserved; @@ -120,6 +146,13 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err) sve = (struct sve_context *)head; flags |= SVE_CTX; break; + case ZA_MAGIC: + if (flags & ZA_CTX) + *err = "Multiple ZA_MAGIC"; + /* Size is validated in validate_za_context() */ + za = (struct za_context *)head; + flags |= ZA_CTX; + break; case EXTRA_MAGIC: if (flags & EXTRA_CTX) *err = "Multiple EXTRA_MAGIC"; @@ -165,6 +198,9 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err) if (flags & SVE_CTX) if (!validate_sve_context(sve, err)) return false; + if (flags & ZA_CTX) + if (!validate_za_context(za, err)) + return false; head = GET_RESV_NEXT_HEAD(head); } diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h index ad884c135314..49f1d5de7b5b 100644 --- a/tools/testing/selftests/arm64/signal/testcases/testcases.h +++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h @@ -16,7 +16,8 @@ #define FPSIMD_CTX (1 << 0) #define SVE_CTX (1 << 1) -#define EXTRA_CTX (1 << 2) +#define ZA_CTX (1 << 2) +#define EXTRA_CTX (1 << 3) #define KSFT_BAD_MAGIC 0xdeadbeef From patchwork Wed Jan 26 15:11:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537011 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EF74C28CF5 for ; Wed, 26 Jan 2022 15:14:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235888AbiAZPOD (ORCPT ); Wed, 26 Jan 2022 10:14:03 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:41992 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPOC (ORCPT ); Wed, 26 Jan 2022 10:14:02 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8802061376 for ; Wed, 26 Jan 2022 15:14:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 396CEC340EC; Wed, 26 Jan 2022 15:13:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210042; bh=ed64zhLtAbkpMXL5V51VDJr543h8QYko5WwjlEnjVQ0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jr006aG1RolHOuKSqnTqwJ/xVuM8/lHZs0PaS05InKkLvA+ue6oYvPb6EJ6+A0iam dzl3q/n1clhPrg/cE+Zx2nODFrbYQyxQdkj9KZu3Nc6jFsaeBwU6SuO/aRyqIwbIPl KXx7Jdkc/i6IQH+7AJbUBIxgn1iLNoRJ15mB7SNBr5uq4N0cwG7Nj0+V81B+0EB/zh JcKcbxy5HJozkgxYfaLgKSm/G3b5jbwRdKz9DzOog/56s4IvHjnA72LTf0AKRiOUsn 8pBmzJBdtVTBbmsvPqiBfxC7gv2SGNPZiWT0RfaMOCCDvfH/ZR+uQMU4eNHcwDGNjH kJZ90w5yESkpg== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 35/40] kselftest/arm64: Add stress test for SME ZA context switching Date: Wed, 26 Jan 2022 15:11:15 +0000 Message-Id: <20220126151120.3811248-36-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=11739; h=from:subject; bh=ed64zhLtAbkpMXL5V51VDJr543h8QYko5WwjlEnjVQ0=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSTdpG8Zl2J2EepYVKuvz25ObyOwtzpjxjcIJ6L skHDa+OJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkkwAKCRAk1otyXVSH0CqAB/ 9BFlvN4Es6YcFfs14DZk7LyDDJK6bH0FizNubJSvOHtofR0m0oA+naLwk5g9MFkct918Big35gHRD7 GjSF2y0xI1LUIWSwpjfOyeQtcpgovZaatisu1WeW0Kb3e0rHr6J7pHaicJlDwYL6KKUSVax/IslbDB OM2T4oDZlZbvqHMlTM4vHgdW6U6+S4p6rbwIWqKB+pjuHG5YAQjDuXyxO+utAF81Pw6Jsfo0DzqAay u2xHDVPRWOfgCxvbEKZPnjzv4eWQbU41+hgjgyTBAn/E/+rQAAPpTISzSf39gXh9KghzmNedE2jgKn Va6egpOrjwACDS6MrVW/H4C1C5QYam X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add a stress test for context switching of the ZA register state based on the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test loops indefinitely writing a data pattern to ZA then reading it back and verifying that it's what was expected. Unlike the other tests we manually assemble the SME instructions since at present no released toolchain has SME support integrated. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/.gitignore | 1 + tools/testing/selftests/arm64/fp/Makefile | 3 + tools/testing/selftests/arm64/fp/za-stress | 59 +++ tools/testing/selftests/arm64/fp/za-test.S | 426 ++++++++++++++++++++ 4 files changed, 489 insertions(+) create mode 100644 tools/testing/selftests/arm64/fp/za-stress create mode 100644 tools/testing/selftests/arm64/fp/za-test.S diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore index 5729a5b1adfc..ead3197e720b 100644 --- a/tools/testing/selftests/arm64/fp/.gitignore +++ b/tools/testing/selftests/arm64/fp/.gitignore @@ -8,3 +8,4 @@ sve-test ssve-test vec-syscfg vlset +za-test diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile index e6643c9b0474..38d2d0d5a0eb 100644 --- a/tools/testing/selftests/arm64/fp/Makefile +++ b/tools/testing/selftests/arm64/fp/Makefile @@ -6,6 +6,7 @@ TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \ rdvl-sme rdvl-sve \ sve-test sve-stress \ ssve-test ssve-stress \ + za-test za-stress \ vlset all: $(TEST_GEN_PROGS) $(TEST_PROGS_EXTENDED) @@ -24,5 +25,7 @@ ssve-test: sve-test.S asm-utils.o $(CC) -DSSVE -nostdlib $^ -o $@ vec-syscfg: vec-syscfg.o rdvl.o vlset: vlset.o +za-test: za-test.o asm-utils.o + $(CC) -nostdlib $^ -o $@ include ../../lib.mk diff --git a/tools/testing/selftests/arm64/fp/za-stress b/tools/testing/selftests/arm64/fp/za-stress new file mode 100644 index 000000000000..5ac386b55b95 --- /dev/null +++ b/tools/testing/selftests/arm64/fp/za-stress @@ -0,0 +1,59 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-only +# Copyright (C) 2015-2019 ARM Limited. +# Original author: Dave Martin + +set -ue + +NR_CPUS=`nproc` + +pids= +logs= + +cleanup () { + trap - INT TERM CHLD + set +e + + if [ -n "$pids" ]; then + kill $pids + wait $pids + pids= + fi + + if [ -n "$logs" ]; then + cat $logs + rm $logs + logs= + fi +} + +interrupt () { + cleanup + exit 0 +} + +child_died () { + cleanup + exit 1 +} + +trap interrupt INT TERM EXIT + +for x in `seq 0 $((NR_CPUS * 4))`; do + log=`mktemp` + logs=$logs\ $log + ./za-test >$log & + pids=$pids\ $! +done + +# Wait for all child processes to be created: +sleep 10 + +while :; do + kill -USR1 $pids +done & +pids=$pids\ $! + +wait + +exit 1 diff --git a/tools/testing/selftests/arm64/fp/za-test.S b/tools/testing/selftests/arm64/fp/za-test.S new file mode 100644 index 000000000000..99e1bb3b6b18 --- /dev/null +++ b/tools/testing/selftests/arm64/fp/za-test.S @@ -0,0 +1,426 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (C) 2021 ARM Limited. +// Original author: Mark Brown +// +// Scalable Matrix Extension ZA context switch test +// Repeatedly writes unique test patterns into each ZA tile +// and reads them back to verify integrity. +// +// for x in `seq 1 NR_CPUS`; do sve-test & pids=$pids\ $! ; done +// (leave it running for as long as you want...) +// kill $pids + +#include +#include "assembler.h" +#include "asm-offsets.h" + +.arch_extension sve + +#define MAXVL 2048 +#define MAXVL_B (MAXVL / 8) + +/* + * LDR (vector to ZA array): + * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _ldr_za nw, nxbase, offset=0 + .inst 0xe1000000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + +/* + * STR (vector from ZA array): + * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _str_za nw, nxbase, offset=0 + .inst 0xe1200000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + +/* + * RDSVL X\nx, #\imm + */ +.macro rdsvl nx, imm + .inst 0x4bf5800 \ + | (\imm << 5) \ + | (\nx) +.endm + +.macro smstop + msr S0_3_C4_C6_3, xzr +.endm + +.macro smstart_za + msr S0_3_C4_C5_3, xzr +.endm + +// Declare some storage space to shadow ZA register contents and a +// scratch buffer for a vector. +.pushsection .text +.data +.align 4 +zaref: + .space MAXVL_B * MAXVL_B +scratch: + .space MAXVL_B +.popsection + +// Trivial memory copy: copy x2 bytes, starting at address x1, to address x0. +// Clobbers x0-x3 +function memcpy + cmp x2, #0 + b.eq 1f +0: ldrb w3, [x1], #1 + strb w3, [x0], #1 + subs x2, x2, #1 + b.ne 0b +1: ret +endfunction + +// Generate a test pattern for storage in ZA +// x0: pid +// x1: row in ZA +// x2: generation + +// These values are used to constuct a 32-bit pattern that is repeated in the +// scratch buffer as many times as will fit: +// bits 31:28 generation number (increments once per test_loop) +// bits 27:16 pid +// bits 15: 8 row number +// bits 7: 0 32-bit lane index + +function pattern + mov w3, wzr + bfi w3, w0, #16, #12 // PID + bfi w3, w1, #8, #8 // Row + bfi w3, w2, #28, #4 // Generation + + ldr x0, =scratch + mov w1, #MAXVL_B / 4 + +0: str w3, [x0], #4 + add w3, w3, #1 // Lane + subs w1, w1, #1 + b.ne 0b + + ret +endfunction + +// Get the address of shadow data for ZA horizontal vector xn +.macro _adrza xd, xn, nrtmp + ldr \xd, =zaref + rdsvl \nrtmp, 1 + madd \xd, x\nrtmp, \xn, \xd +.endm + +// Set up test pattern in a ZA horizontal vector +// x0: pid +// x1: row number +// x2: generation +function setup_za + mov x4, x30 + mov x12, x1 // Use x12 for vector select + + bl pattern // Get pattern in scratch buffer + _adrza x0, x12, 2 // Shadow buffer pointer to x0 and x5 + mov x5, x0 + ldr x1, =scratch + bl memcpy // length set up in x2 by _adrza + + _ldr_za 12, 5 // load vector w12 from pointer x5 + + ret x4 +endfunction + +// Trivial memory compare: compare x2 bytes starting at address x0 with +// bytes starting at address x1. +// Returns only if all bytes match; otherwise, the program is aborted. +// Clobbers x0-x5. +function memcmp + cbz x2, 2f + + stp x0, x1, [sp, #-0x20]! + str x2, [sp, #0x10] + + mov x5, #0 +0: ldrb w3, [x0, x5] + ldrb w4, [x1, x5] + add x5, x5, #1 + cmp w3, w4 + b.ne 1f + subs x2, x2, #1 + b.ne 0b + +1: ldr x2, [sp, #0x10] + ldp x0, x1, [sp], #0x20 + b.ne barf + +2: ret +endfunction + +// Verify that a ZA vector matches its shadow in memory, else abort +// x0: row number +// Clobbers x0-x7 and x12. +function check_za + mov x3, x30 + + mov x12, x0 + _adrza x5, x0, 6 // pointer to expected value in x5 + mov x4, x0 + ldr x7, =scratch // x7 is scratch + + mov x0, x7 // Poison scratch + mov x1, x6 + bl memfill_ae + + _str_za 12, 7 // save vector w12 to pointer x7 + + mov x0, x5 + mov x1, x7 + mov x2, x6 + mov x30, x3 + b memcmp +endfunction + +// Any SME register modified here can cause corruption in the main +// thread -- but *only* the locations modified here. +function irritator_handler + // Increment the irritation signal count (x23): + ldr x0, [x2, #ucontext_regs + 8 * 23] + add x0, x0, #1 + str x0, [x2, #ucontext_regs + 8 * 23] + + // Corrupt some random ZA data +#if 0 + adr x0, .text + (irritator_handler - .text) / 16 * 16 + movi v0.8b, #1 + movi v9.16b, #2 + movi v31.8b, #3 +#endif + + ret +endfunction + +function terminate_handler + mov w21, w0 + mov x20, x2 + + puts "Terminated by signal " + mov w0, w21 + bl putdec + puts ", no error, iterations=" + ldr x0, [x20, #ucontext_regs + 8 * 22] + bl putdec + puts ", signals=" + ldr x0, [x20, #ucontext_regs + 8 * 23] + bl putdecn + + mov x0, #0 + mov x8, #__NR_exit + svc #0 +endfunction + +// w0: signal number +// x1: sa_action +// w2: sa_flags +// Clobbers x0-x6,x8 +function setsignal + str x30, [sp, #-((sa_sz + 15) / 16 * 16 + 16)]! + + mov w4, w0 + mov x5, x1 + mov w6, w2 + + add x0, sp, #16 + mov x1, #sa_sz + bl memclr + + mov w0, w4 + add x1, sp, #16 + str w6, [x1, #sa_flags] + str x5, [x1, #sa_handler] + mov x2, #0 + mov x3, #sa_mask_sz + mov x8, #__NR_rt_sigaction + svc #0 + + cbz w0, 1f + + puts "sigaction failure\n" + b .Labort + +1: ldr x30, [sp], #((sa_sz + 15) / 16 * 16 + 16) + ret +endfunction + +// Main program entry point +.globl _start +function _start +_start: + puts "Streaming mode " + smstart_za + + // Sanity-check and report the vector length + + rdsvl 19, 8 + cmp x19, #128 + b.lo 1f + cmp x19, #2048 + b.hi 1f + tst x19, #(8 - 1) + b.eq 2f + +1: puts "bad vector length: " + mov x0, x19 + bl putdecn + b .Labort + +2: puts "vector length:\t" + mov x0, x19 + bl putdec + puts " bits\n" + + // Obtain our PID, to ensure test pattern uniqueness between processes + mov x8, #__NR_getpid + svc #0 + mov x20, x0 + + puts "PID:\t" + mov x0, x20 + bl putdecn + + mov x23, #0 // Irritation signal count + + mov w0, #SIGINT + adr x1, terminate_handler + mov w2, #SA_SIGINFO + bl setsignal + + mov w0, #SIGTERM + adr x1, terminate_handler + mov w2, #SA_SIGINFO + bl setsignal + + mov w0, #SIGUSR1 + adr x1, irritator_handler + mov w2, #SA_SIGINFO + orr w2, w2, #SA_NODEFER + bl setsignal + + mov x22, #0 // generation number, increments per iteration +.Ltest_loop: + rdsvl 0, 8 + cmp x0, x19 + b.ne vl_barf + + rdsvl 21, 1 // Set up ZA & shadow with test pattern +0: mov x0, x20 + sub x1, x21, #1 + mov x2, x22 + bl setup_za + subs x21, x21, #1 + b.ne 0b + + and x8, x22, #127 // Every 128 interations... + cbz x8, 0f + mov x8, #__NR_getpid // (otherwise minimal syscall) + b 1f +0: + mov x8, #__NR_sched_yield // ...encourage preemption +1: + svc #0 + + mrs x0, S3_3_C4_C2_2 // SVCR should have ZA=1,SM=0 + and x1, x0, #3 + cmp x1, #2 + b.ne svcr_barf + + rdsvl 21, 1 // Verify that the data made it through + rdsvl 24, 1 // Verify that the data made it through +0: sub x0, x24, x21 + bl check_za + subs x21, x21, #1 + bne 0b + + add x22, x22, #1 // Everything still working + b .Ltest_loop + +.Labort: + mov x0, #0 + mov x1, #SIGABRT + mov x8, #__NR_kill + svc #0 +endfunction + +function barf +// fpsimd.c acitivty log dump hack +// ldr w0, =0xdeadc0de +// mov w8, #__NR_exit +// svc #0 +// end hack + smstop + mov x10, x0 // expected data + mov x11, x1 // actual data + mov x12, x2 // data size + + puts "Mismatch: PID=" + mov x0, x20 + bl putdec + puts ", iteration=" + mov x0, x22 + bl putdec + puts ", row=" + mov x0, x21 + bl putdecn + puts "\tExpected [" + mov x0, x10 + mov x1, x12 + bl dumphex + puts "]\n\tGot [" + mov x0, x11 + mov x1, x12 + bl dumphex + puts "]\n" + + mov x8, #__NR_getpid + svc #0 +// fpsimd.c acitivty log dump hack +// ldr w0, =0xdeadc0de +// mov w8, #__NR_exit +// svc #0 +// ^ end of hack + mov x1, #SIGABRT + mov x8, #__NR_kill + svc #0 +// mov x8, #__NR_exit +// mov x1, #1 +// svc #0 +endfunction + +function vl_barf + mov x10, x0 + + puts "Bad active VL: " + mov x0, x10 + bl putdecn + + mov x8, #__NR_exit + mov x1, #1 + svc #0 +endfunction + +function svcr_barf + mov x10, x0 + + puts "Bad SVCR: " + mov x0, x10 + bl putdecn + + mov x8, #__NR_exit + mov x1, #1 + svc #0 +endfunction From patchwork Wed Jan 26 15:11:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537356 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51C45C2BA4C for ; Wed, 26 Jan 2022 15:14:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242626AbiAZPOI (ORCPT ); Wed, 26 Jan 2022 10:14:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233709AbiAZPOI (ORCPT ); Wed, 26 Jan 2022 10:14:08 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C229AC06161C for ; Wed, 26 Jan 2022 07:14:07 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 52A2161463 for ; Wed, 26 Jan 2022 15:14:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9FE38C340E8; Wed, 26 Jan 2022 15:14:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210046; bh=68+DVrIDwed146AFp7Wb4F5SkAk7GsLeX/otWxDBYJM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=D5yml8SD43W1FJTosOXt52dd8LuN1CQhc+WBaCSi6FLecE+q6LYUipiqOPg+pxWsA hgRRiB03C32Qy1ykKekXi6aoFZf2BYbXPsK6PmkeQUNidk7WvZyCRSPE9uiHheOmKl Ussg3ThkpM85sl5CNKx4jZyTOp8GIuEBfZcCCblMYSzrUm4ytVBgIregbW3jwDE22R 4oByBl8qMb36qmaBekY6BVYuMRMocdx0PFmxrnBG1yx/f0V9N/zRN59mEdXjIAFN1h PXnJxKAYLW6gSFD/aY6W6j/Oo8qKnQVmvxbn59ZY6iMOCz1Xslx0I4cnrSRHRe/CqG 4T6ZDJdGxEz1A== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 36/40] kselftest/arm64: signal: Add SME signal handling tests Date: Wed, 26 Jan 2022 15:11:16 +0000 Message-Id: <20220126151120.3811248-37-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=14953; h=from:subject; bh=68+DVrIDwed146AFp7Wb4F5SkAk7GsLeX/otWxDBYJM=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSTKeiwgPH3+W7T7snJgn2+mDYP8UWkWAcUN9rz /+F1YsGJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFkkwAKCRAk1otyXVSH0B4QB/ 0TTdsSHHZLx01OYpnASktuvXNhqfK/2h4NzXgzErOrB6hMtp5zP5LS9itBDqX/4uXqj0qnB0o8m3OH k0v648lOTDSTSUq1g/+9Iob9EO5zxE5GWY+6K4Xk0EEx0QYH7pvPnnw8AKR2+5bj2ox0Bq8Wp/k2XS xssdA/u0cchnljGfdl51x4NxknpzPlmwk2Thp72VCNG/lm7qHslyyG168ykjmlhXpPIlY9WSdByYto QQ31ldL2y2JomN2IR7eu0n1MrUDX0yO6U0+zP5QKgbU1mrTDT2zgBI8nwCWJdzzJBVtAYm1CkS+tRr WcYHZy/vtKjoHvrJX2F0BukTxfgAAY X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add test cases for the SME signal handing ABI patterned off the SVE tests. Due to the small size of the tests and the differences in ABI (especially around needing to account for both streaming SVE and ZA) there is some code duplication here. We currently cover: - Reporting of the vector length. - Lack of support for changing vector length. - Presence and size of register state for streaming SVE and ZA. As with the SVE tests we do not yet have any validation of register contents. Signed-off-by: Mark Brown --- .../testing/selftests/arm64/signal/.gitignore | 2 + .../selftests/arm64/signal/test_signals.h | 4 + .../arm64/signal/test_signals_utils.c | 6 + .../testcases/fake_sigreturn_sme_change_vl.c | 92 +++++++++++++ .../arm64/signal/testcases/sme_trap_no_sm.c | 38 ++++++ .../signal/testcases/sme_trap_non_streaming.c | 45 ++++++ .../arm64/signal/testcases/sme_trap_za.c | 36 +++++ .../selftests/arm64/signal/testcases/sme_vl.c | 68 +++++++++ .../arm64/signal/testcases/ssve_regs.c | 129 ++++++++++++++++++ 9 files changed, 420 insertions(+) create mode 100644 tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_no_sm.c create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_non_streaming.c create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_vl.c create mode 100644 tools/testing/selftests/arm64/signal/testcases/ssve_regs.c diff --git a/tools/testing/selftests/arm64/signal/.gitignore b/tools/testing/selftests/arm64/signal/.gitignore index c1742755abb9..4de8eb26d4de 100644 --- a/tools/testing/selftests/arm64/signal/.gitignore +++ b/tools/testing/selftests/arm64/signal/.gitignore @@ -1,5 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only mangle_* fake_sigreturn_* +sme_* +ssve_* sve_* !*.[ch] diff --git a/tools/testing/selftests/arm64/signal/test_signals.h b/tools/testing/selftests/arm64/signal/test_signals.h index f909b70d9e98..c70fdec7d7c4 100644 --- a/tools/testing/selftests/arm64/signal/test_signals.h +++ b/tools/testing/selftests/arm64/signal/test_signals.h @@ -34,11 +34,15 @@ enum { FSSBS_BIT, FSVE_BIT, + FSME_BIT, + FSME_FA64_BIT, FMAX_END }; #define FEAT_SSBS (1UL << FSSBS_BIT) #define FEAT_SVE (1UL << FSVE_BIT) +#define FEAT_SME (1UL << FSME_BIT) +#define FEAT_SME_FA64 (1UL << FSME_FA64_BIT) /* * A descriptor used to describe and configure a test case. diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.c b/tools/testing/selftests/arm64/signal/test_signals_utils.c index 5743897984b0..b588d10afd5b 100644 --- a/tools/testing/selftests/arm64/signal/test_signals_utils.c +++ b/tools/testing/selftests/arm64/signal/test_signals_utils.c @@ -27,6 +27,8 @@ static int sig_copyctx = SIGTRAP; static char const *const feats_names[FMAX_END] = { " SSBS ", " SVE ", + " SME ", + " FA64 ", }; #define MAX_FEATS_SZ 128 @@ -268,6 +270,10 @@ int test_init(struct tdescr *td) td->feats_supported |= FEAT_SSBS; if (getauxval(AT_HWCAP) & HWCAP_SVE) td->feats_supported |= FEAT_SVE; + if (getauxval(AT_HWCAP2) & HWCAP2_SME) + td->feats_supported |= FEAT_SME; + if (getauxval(AT_HWCAP2) & HWCAP2_SME_FA64) + td->feats_supported |= FEAT_SME_FA64; if (feats_ok(td)) { if (td->feats_required & td->feats_supported) fprintf(stderr, diff --git a/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c b/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c new file mode 100644 index 000000000000..7ed762b7202f --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Attempt to change the streaming SVE vector length in a signal + * handler, this is not supported and is expected to segfault. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +struct fake_sigframe sf; +static unsigned int vls[SVE_VQ_MAX]; +unsigned int nvls = 0; + +static bool sme_get_vls(struct tdescr *td) +{ + int vq, vl; + + /* + * Enumerate up to SVE_VQ_MAX vector lengths + */ + for (vq = SVE_VQ_MAX; vq > 0; --vq) { + vl = prctl(PR_SVE_SET_VL, vq * 16); + if (vl == -1) + return false; + + vl &= PR_SME_VL_LEN_MASK; + + /* Skip missing VLs */ + vq = sve_vq_from_vl(vl); + + vls[nvls++] = vl; + } + + /* We need at least two VLs */ + if (nvls < 2) { + fprintf(stderr, "Only %d VL supported\n", nvls); + return false; + } + + return true; +} + +static int fake_sigreturn_ssve_change_vl(struct tdescr *td, + siginfo_t *si, ucontext_t *uc) +{ + size_t resv_sz, offset; + struct _aarch64_ctx *head = GET_SF_RESV_HEAD(sf); + struct sve_context *sve; + + /* Get a signal context with a SME ZA frame in it */ + if (!get_current_context(td, &sf.uc)) + return 1; + + resv_sz = GET_SF_RESV_SIZE(sf); + head = get_header(head, SVE_MAGIC, resv_sz, &offset); + if (!head) { + fprintf(stderr, "No SVE context\n"); + return 1; + } + + if (head->size != sizeof(struct sve_context)) { + fprintf(stderr, "Register data present, aborting\n"); + return 1; + } + + sve = (struct sve_context *)head; + + /* No changes are supported; init left us at minimum VL so go to max */ + fprintf(stderr, "Attempting to change VL from %d to %d\n", + sve->vl, vls[0]); + sve->vl = vls[0]; + + fake_sigreturn(&sf, sizeof(sf), 0); + + return 1; +} + +struct tdescr tde = { + .name = "FAKE_SIGRETURN_SSVE_CHANGE", + .descr = "Attempt to change Streaming SVE VL", + .feats_required = FEAT_SME, + .sig_ok = SIGSEGV, + .timeout = 3, + .init = sme_get_vls, + .run = fake_sigreturn_ssve_change_vl, +}; diff --git a/tools/testing/selftests/arm64/signal/testcases/sme_trap_no_sm.c b/tools/testing/selftests/arm64/signal/testcases/sme_trap_no_sm.c new file mode 100644 index 000000000000..f9d76ae32bba --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/sme_trap_no_sm.c @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Verify that using a streaming mode instruction without enabling it + * generates a SIGILL. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +int sme_trap_no_sm_trigger(struct tdescr *td) +{ + /* SMSTART ZA ; ADDHA ZA0.S, P0/M, P0/M, Z0.S */ + asm volatile(".inst 0xd503457f ; .inst 0xc0900000"); + + return 0; +} + +int sme_trap_no_sm_run(struct tdescr *td, siginfo_t *si, ucontext_t *uc) +{ + return 1; +} + +struct tdescr tde = { + .name = "SME trap without SM", + .descr = "Check that we get a SIGILL if we use streaming mode without enabling it", + .timeout = 3, + .feats_required = FEAT_SME, /* We need a SMSTART ZA */ + .sanity_disabled = true, + .trigger = sme_trap_no_sm_trigger, + .run = sme_trap_no_sm_run, + .sig_ok = SIGILL, +}; diff --git a/tools/testing/selftests/arm64/signal/testcases/sme_trap_non_streaming.c b/tools/testing/selftests/arm64/signal/testcases/sme_trap_non_streaming.c new file mode 100644 index 000000000000..e469ae5348e3 --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/sme_trap_non_streaming.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Verify that using an instruction not supported in streaming mode + * traps when in streaming mode. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +int sme_trap_non_streaming_trigger(struct tdescr *td) +{ + /* + * The framework will handle SIGILL so we need to exit SM to + * stop any other code triggering a further SIGILL down the + * line from using a streaming-illegal instruction. + */ + asm volatile(".inst 0xd503437f; /* SMSTART ZA */ \ + cnt v0.16b, v0.16b; \ + .inst 0xd503447f /* SMSTOP ZA */"); + + return 0; +} + +int sme_trap_non_streaming_run(struct tdescr *td, siginfo_t *si, ucontext_t *uc) +{ + return 1; +} + +struct tdescr tde = { + .name = "SME SM trap unsupported instruction", + .descr = "Check that we get a SIGILL if we use an unsupported instruction in streaming mode", + .feats_required = FEAT_SME, + .feats_incompatible = FEAT_SME_FA64, + .timeout = 3, + .sanity_disabled = true, + .trigger = sme_trap_non_streaming_trigger, + .run = sme_trap_non_streaming_run, + .sig_ok = SIGILL, +}; diff --git a/tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c b/tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c new file mode 100644 index 000000000000..3a7747af4715 --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c @@ -0,0 +1,36 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Verify that accessing ZA without enabling it generates a SIGILL. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +int sme_trap_za_trigger(struct tdescr *td) +{ + /* ZERO ZA */ + asm volatile(".inst 0xc00800ff"); + + return 0; +} + +int sme_trap_za_run(struct tdescr *td, siginfo_t *si, ucontext_t *uc) +{ + return 1; +} + +struct tdescr tde = { + .name = "SME ZA trap", + .descr = "Check that we get a SIGILL if we access ZA without enabling", + .timeout = 3, + .sanity_disabled = true, + .trigger = sme_trap_za_trigger, + .run = sme_trap_za_run, + .sig_ok = SIGILL, +}; diff --git a/tools/testing/selftests/arm64/signal/testcases/sme_vl.c b/tools/testing/selftests/arm64/signal/testcases/sme_vl.c new file mode 100644 index 000000000000..13ff3b35cbaf --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/sme_vl.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Check that the SME vector length reported in signal contexts is the + * expected one. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +struct fake_sigframe sf; +unsigned int vl; + +static bool get_sme_vl(struct tdescr *td) +{ + int ret = prctl(PR_SME_GET_VL); + if (ret == -1) + return false; + + vl = ret; + + return true; +} + +static int sme_vl(struct tdescr *td, siginfo_t *si, ucontext_t *uc) +{ + size_t resv_sz, offset; + struct _aarch64_ctx *head = GET_SF_RESV_HEAD(sf); + struct za_context *za; + + /* Get a signal context which should have a ZA frame in it */ + if (!get_current_context(td, &sf.uc)) + return 1; + + resv_sz = GET_SF_RESV_SIZE(sf); + head = get_header(head, ZA_MAGIC, resv_sz, &offset); + if (!head) { + fprintf(stderr, "No ZA context\n"); + return 1; + } + za = (struct za_context *)head; + + if (za->vl != vl) { + fprintf(stderr, "ZA sigframe VL %u, expected %u\n", + za->vl, vl); + return 1; + } else { + fprintf(stderr, "got expected VL %u\n", vl); + } + + td->pass = 1; + + return 0; +} + +struct tdescr tde = { + .name = "SME VL", + .descr = "Check that we get the right SME VL reported", + .feats_required = FEAT_SME, + .timeout = 3, + .init = get_sme_vl, + .run = sme_vl, +}; diff --git a/tools/testing/selftests/arm64/signal/testcases/ssve_regs.c b/tools/testing/selftests/arm64/signal/testcases/ssve_regs.c new file mode 100644 index 000000000000..44a08d43cd50 --- /dev/null +++ b/tools/testing/selftests/arm64/signal/testcases/ssve_regs.c @@ -0,0 +1,129 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 ARM Limited + * + * Verify that the streaming SVE register context in signal frames is + * set up as expected. + */ + +#include +#include +#include + +#include "test_signals_utils.h" +#include "testcases.h" + +struct fake_sigframe sf; +static unsigned int vls[SVE_VQ_MAX]; +unsigned int nvls = 0; + +static bool sme_get_vls(struct tdescr *td) +{ + int vq, vl; + + /* + * Enumerate up to SVE_VQ_MAX vector lengths + */ + for (vq = SVE_VQ_MAX; vq > 0; --vq) { + vl = prctl(PR_SVE_SET_VL, vq * 16); + if (vl == -1) + return false; + + vl &= PR_SME_VL_LEN_MASK; + + /* Skip missing VLs */ + vq = sve_vq_from_vl(vl); + + vls[nvls++] = vl; + } + + /* We need at least one VL */ + if (nvls < 1) { + fprintf(stderr, "Only %d VL supported\n", nvls); + return false; + } + + return true; +} + +static void setup_ssve_regs(void) +{ + /* SMSTART SM */ + asm volatile(".inst 0x7f4303d5"); + + /* RDVL x16, #1 so we should have SVE regs; real data is TODO */ + asm volatile(".inst 0x04bf5030" : : : "x16" ); +} + +static int do_one_sme_vl(struct tdescr *td, siginfo_t *si, ucontext_t *uc, + unsigned int vl) +{ + size_t resv_sz, offset; + struct _aarch64_ctx *head = GET_SF_RESV_HEAD(sf); + struct sve_context *ssve; + + fprintf(stderr, "Testing VL %d\n", vl); + + if (prctl(PR_SME_SET_VL, vl) == -1) { + fprintf(stderr, "Failed to set VL\n"); + return 1; + } + + /* + * Get a signal context which should have a SVE frame and registers + * in it. + */ + setup_ssve_regs(); + if (!get_current_context(td, &sf.uc)) + return 1; + + resv_sz = GET_SF_RESV_SIZE(sf); + head = get_header(head, SVE_MAGIC, resv_sz, &offset); + if (!head) { + fprintf(stderr, "No SVE context\n"); + return 1; + } + + ssve = (struct sve_context *)head; + if (ssve->vl != vl) { + fprintf(stderr, "Got VL %d, expected %d\n", ssve->vl, vl); + return 1; + } + + /* The actual size validation is done in get_current_context() */ + fprintf(stderr, "Got expected size %u and VL %d\n", + head->size, ssve->vl); + + return 0; +} + +static int sme_regs(struct tdescr *td, siginfo_t *si, ucontext_t *uc) +{ + int i; + + for (i = 0; i < nvls; i++) { + /* + * TODO: the signal test helpers can't currently cope + * with signal frames bigger than struct sigcontext, + * skip VLs that will trigger that. + */ + if (vls[i] > 64) + continue; + + if (do_one_sme_vl(td, si, uc, vls[i])) + return 1; + } + + td->pass = 1; + + return 0; +} + +struct tdescr tde = { + .name = "Streaming SVE registers", + .descr = "Check that we get the right Streaming SVE registers reported", + .feats_required = FEAT_SME, + .timeout = 3, + .init = sme_get_vls, + .run = sme_regs, +}; From patchwork Wed Jan 26 15:11:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537010 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FF55C2BA4C for ; Wed, 26 Jan 2022 15:14:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233709AbiAZPOP (ORCPT ); Wed, 26 Jan 2022 10:14:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41264 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242627AbiAZPOO (ORCPT ); Wed, 26 Jan 2022 10:14:14 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8380CC06161C for ; Wed, 26 Jan 2022 07:14:12 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 231CB61842 for ; Wed, 26 Jan 2022 15:14:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 383FDC340EB; Wed, 26 Jan 2022 15:14:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210051; bh=Oja1/wO9gzdoi99/QIiaS7CVLDJss5+ewa1C/MDEXls=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UspBiTLX7oA/wP9GC6WdtaGZ6p6ABC3ihVdAn1wAEuQxR7031HUx/7Q6qOVW9EHQG ZnNAQxk55sPUOqZWGN3yfV/j8TAyTmQiyn3lPAWQcjqz5VxEMzvq2yvy+15B68GWzO vQAH5o8mUOBdWBJLUqPZ9UooD+n6mYI3iu/zyL9GzcEYcMbuj7OzMe/1m/6Xl79twS KGGXsriV9t85PcIK+QD+ZAh8mjX9oNRLWe6fYNhZZ+Gv7dUlLdVQGeUcA3H/cCzYja GXI9klyRdUijcueJgHzA5LJcz0tOfLi8DYSsTDNArUjkNeNEKHvVwfJGfG0a8e65LJ hCRbLPG4wxB2g== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 37/40] kselftest/arm64: Add streaming SVE to SVE ptrace tests Date: Wed, 26 Jan 2022 15:11:17 +0000 Message-Id: <20220126151120.3811248-38-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1273; h=from:subject; bh=Oja1/wO9gzdoi99/QIiaS7CVLDJss5+ewa1C/MDEXls=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSUKqGPNGZXH7z6iwfnPdNtlzhXYbjCCRcGKiz8 vwxyiyiJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFklAAKCRAk1otyXVSH0ACqB/ 9H5FbWIzMbqx0jF7effH5vEJmz1AFPrA2WB3HBOBbIaMlvmVGPsk3rjTzrJrv2YhmcvbR4SH5qWjcy iNEZuf584MdmuNQf2scMulPNuNY9rjIQu4Rr8CKZSDxN0HWjHYM+irqzzAK7vpCPjSxCJwJ/k5Rkg6 fOZ02eQLTWJorxtm9EjpzJhGjbhnxQElfc8fsuFP9Z1T2EVhYxIPEW12EawWFbyXbKZxanoZ/Ynk3Z 83C7JnWCoxNckdqvUcuZ08P/GFqF9FLcAl9s3wU6mHTHWthinLymZlSX1F7WKghIORkiBoLQ8wya+/ qWlaHX0+sSFYwaezFs0GYsoiSow8WU X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org In order to allow ptrace of streaming mode SVE registers we have added a new regset for streaming mode which in isolation offers the same ABI as regular SVE with a different vector type. Add this to the array of regsets we handle, together with additional tests for the interoperation of the two regsets. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/sve-ptrace.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c index 90ba1d6a6781..fb74b11bab71 100644 --- a/tools/testing/selftests/arm64/fp/sve-ptrace.c +++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c @@ -26,6 +26,10 @@ #define NT_ARM_SVE 0x405 #endif +#ifndef NT_ARM_SSVE +#define NT_ARM_SSVE 0x40b +#endif + struct vec_type { const char *name; unsigned long hwcap_type; @@ -42,6 +46,13 @@ static const struct vec_type vec_types[] = { .regset = NT_ARM_SVE, .prctl_set = PR_SVE_SET_VL, }, + { + .name = "Streaming SVE", + .hwcap_type = AT_HWCAP2, + .hwcap = HWCAP2_SME, + .regset = NT_ARM_SSVE, + .prctl_set = PR_SME_SET_VL, + }, }; #define VL_TESTS (((SVE_VQ_MAX - SVE_VQ_MIN) + 1) * 3) From patchwork Wed Jan 26 15:11:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D564DC2BA4C for ; Wed, 26 Jan 2022 15:14:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242628AbiAZPOT (ORCPT ); Wed, 26 Jan 2022 10:14:19 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:42122 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242627AbiAZPOT (ORCPT ); Wed, 26 Jan 2022 10:14:19 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 27CB9B81CA7 for ; Wed, 26 Jan 2022 15:14:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 22CE2C340E9; Wed, 26 Jan 2022 15:14:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210056; bh=Or9AVgLUKdbiSWN+8SU2UGG+u6koLwhaCiyfY1RVxtQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=k+t/GgsdVC3WqchMLptbAG2+e6HYmcCsoEQbzbYvaONKhzfl9dyeDlTmghPUE3bxY GsRwiRZWjAOfODMKF6gEkIpTvOeeS/IOP/cXC2yAu3bRJOKitLFV6wf6cNMCxNycKp QRGoGu+lu+fyvNTtrKiRJfFnFoJbCPHs5ejS5T6dEhLmOUafjhvEh+7/rpBUX6IRqi Ffn2QYCkhPkLmRcU2AsDVpEPD+XsEt6gU5DtowGEKLC9QJAWLPOypHN9zlYp5RwPrb mSfbam+9a8tczrx2TkagWpM9NicAOPFwEHc457rpQ0ygA3O180bi0ESyudwvI9gtGQ EB/v+xL9AGo8w== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 38/40] kselftest/arm64: Add coverage for the ZA ptrace interface Date: Wed, 26 Jan 2022 15:11:18 +0000 Message-Id: <20220126151120.3811248-39-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=10064; h=from:subject; bh=Or9AVgLUKdbiSWN+8SU2UGG+u6koLwhaCiyfY1RVxtQ=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSVzhOmKd+a7ZXnxvpQMJJSUSKWZk8zhhnb3lko O5oOt8uJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFklQAKCRAk1otyXVSH0MO2B/ 9YkKgKV6hVU/R2Xh1VrxM0fMbkV/3tHlJnPZv8fV/g/A+dbE+39O6wCt+NCh0fmZLX35lKi30pX/R1 h/H3n41SfIi20Wl1k9jFonN5W6X7VVxs+qf2syZaerzwd+xCgR8/p6ty8fENmg99z3lD+x4UTz4Kjf EnIja/a7aWjH2s4myClaLDMbBynWSvoOvqabd/xgrKeTVCUcxiM28IL3QcDI6xfn2Qe9ziNuf4tSf1 dY2YAlJ2GxE/jz/7tAhviK82aqu0G4KSUlVqly0/6+7wbwprlmdjjWewEPY09bplOyUXN6b3w4AxfO lZJMFNOEKmUuFAAnh9pvJyYmLFmV6c X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add some basic coverage for the ZA ptrace interface, including walking through all the vector lengths supported in the system. As with SVE we do not currently validate the data all the way through to the running process. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/.gitignore | 1 + tools/testing/selftests/arm64/fp/Makefile | 3 +- tools/testing/selftests/arm64/fp/za-ptrace.c | 354 +++++++++++++++++++ 3 files changed, 357 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/arm64/fp/za-ptrace.c diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore index ead3197e720b..d98d3d48b504 100644 --- a/tools/testing/selftests/arm64/fp/.gitignore +++ b/tools/testing/selftests/arm64/fp/.gitignore @@ -8,4 +8,5 @@ sve-test ssve-test vec-syscfg vlset +za-ptrace za-test diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile index 38d2d0d5a0eb..807a8faf8d57 100644 --- a/tools/testing/selftests/arm64/fp/Makefile +++ b/tools/testing/selftests/arm64/fp/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 CFLAGS += -I../../../../../usr/include/ -TEST_GEN_PROGS := sve-ptrace sve-probe-vls vec-syscfg +TEST_GEN_PROGS := sve-ptrace sve-probe-vls vec-syscfg za-ptrace TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \ rdvl-sme rdvl-sve \ sve-test sve-stress \ @@ -27,5 +27,6 @@ vec-syscfg: vec-syscfg.o rdvl.o vlset: vlset.o za-test: za-test.o asm-utils.o $(CC) -nostdlib $^ -o $@ +za-ptrace: za-ptrace.o include ../../lib.mk diff --git a/tools/testing/selftests/arm64/fp/za-ptrace.c b/tools/testing/selftests/arm64/fp/za-ptrace.c new file mode 100644 index 000000000000..692c82624855 --- /dev/null +++ b/tools/testing/selftests/arm64/fp/za-ptrace.c @@ -0,0 +1,354 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2021 ARM Limited. + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../../kselftest.h" + +/* and don't like each other, so: */ +#ifndef NT_ARM_ZA +#define NT_ARM_ZA 0x40c +#endif + +#define EXPECTED_TESTS (((SVE_VQ_MAX - SVE_VQ_MIN) + 1) * 3) + +static void fill_buf(char *buf, size_t size) +{ + int i; + + for (i = 0; i < size; i++) + buf[i] = random(); +} + +static int do_child(void) +{ + if (ptrace(PTRACE_TRACEME, -1, NULL, NULL)) + ksft_exit_fail_msg("PTRACE_TRACEME", strerror(errno)); + + if (raise(SIGSTOP)) + ksft_exit_fail_msg("raise(SIGSTOP)", strerror(errno)); + + return EXIT_SUCCESS; +} + +static struct user_za_header *get_za(pid_t pid, void **buf, size_t *size) +{ + struct user_za_header *za; + void *p; + size_t sz = sizeof(*za); + struct iovec iov; + + while (1) { + if (*size < sz) { + p = realloc(*buf, sz); + if (!p) { + errno = ENOMEM; + goto error; + } + + *buf = p; + *size = sz; + } + + iov.iov_base = *buf; + iov.iov_len = sz; + if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZA, &iov)) + goto error; + + za = *buf; + if (za->size <= sz) + break; + + sz = za->size; + } + + return za; + +error: + return NULL; +} + +static int set_za(pid_t pid, const struct user_za_header *za) +{ + struct iovec iov; + + iov.iov_base = (void *)za; + iov.iov_len = za->size; + return ptrace(PTRACE_SETREGSET, pid, NT_ARM_ZA, &iov); +} + +/* Validate attempting to set the specfied VL via ptrace */ +static void ptrace_set_get_vl(pid_t child, unsigned int vl, bool *supported) +{ + struct user_za_header za; + struct user_za_header *new_za = NULL; + size_t new_za_size = 0; + int ret, prctl_vl; + + *supported = false; + + /* Check if the VL is supported in this process */ + prctl_vl = prctl(PR_SME_SET_VL, vl); + if (prctl_vl == -1) + ksft_exit_fail_msg("prctl(PR_SME_SET_VL) failed: %s (%d)\n", + strerror(errno), errno); + + /* If the VL is not supported then a supported VL will be returned */ + *supported = (prctl_vl == vl); + + /* Set the VL by doing a set with no register payload */ + memset(&za, 0, sizeof(za)); + za.size = sizeof(za); + za.vl = vl; + ret = set_za(child, &za); + if (ret != 0) { + ksft_test_result_fail("Failed to set VL %u\n", vl); + return; + } + + /* + * Read back the new register state and verify that we have the + * same VL that we got from prctl() on ourselves. + */ + if (!get_za(child, (void **)&new_za, &new_za_size)) { + ksft_test_result_fail("Failed to read VL %u\n", vl); + return; + } + + ksft_test_result(new_za->vl = prctl_vl, "Set VL %u\n", vl); + + free(new_za); +} + +/* Validate attempting to set no ZA data and read it back */ +static void ptrace_set_no_data(pid_t child, unsigned int vl) +{ + void *read_buf = NULL; + struct user_za_header write_za; + struct user_za_header *read_za; + size_t read_za_size = 0; + int ret; + + /* Set up some data and write it out */ + memset(&write_za, 0, sizeof(write_za)); + write_za.size = ZA_PT_ZA_OFFSET; + write_za.vl = vl; + + ret = set_za(child, &write_za); + if (ret != 0) { + ksft_test_result_fail("Failed to set VL %u no data\n", vl); + return; + } + + /* Read the data back */ + if (!get_za(child, (void **)&read_buf, &read_za_size)) { + ksft_test_result_fail("Failed to read VL %u no data\n", vl); + return; + } + read_za = read_buf; + + /* We might read more data if there's extensions we don't know */ + if (read_za->size < write_za.size) { + ksft_test_result_fail("VL %u wrote %d bytes, only read %d\n", + vl, write_za.size, read_za->size); + goto out_read; + } + + ksft_test_result(read_za->size == write_za.size, + "Disabled ZA for VL %u\n", vl); + +out_read: + free(read_buf); +} + +/* Validate attempting to set data and read it back */ +static void ptrace_set_get_data(pid_t child, unsigned int vl) +{ + void *write_buf; + void *read_buf = NULL; + struct user_za_header *write_za; + struct user_za_header *read_za; + size_t read_za_size = 0; + unsigned int vq = sve_vq_from_vl(vl); + int ret; + size_t data_size; + + data_size = ZA_PT_SIZE(vq); + write_buf = malloc(data_size); + if (!write_buf) { + ksft_test_result_fail("Error allocating %d byte buffer for VL %u\n", + data_size, vl); + return; + } + write_za = write_buf; + + /* Set up some data and write it out */ + memset(write_za, 0, data_size); + write_za->size = data_size; + write_za->vl = vl; + + fill_buf(write_buf + ZA_PT_ZA_OFFSET, ZA_PT_ZA_SIZE(vq)); + + ret = set_za(child, write_za); + if (ret != 0) { + ksft_test_result_fail("Failed to set VL %u data\n", vl); + goto out; + } + + /* Read the data back */ + if (!get_za(child, (void **)&read_buf, &read_za_size)) { + ksft_test_result_fail("Failed to read VL %u data\n", vl); + goto out; + } + read_za = read_buf; + + /* We might read more data if there's extensions we don't know */ + if (read_za->size < write_za->size) { + ksft_test_result_fail("VL %u wrote %d bytes, only read %d\n", + vl, write_za->size, read_za->size); + goto out_read; + } + + ksft_test_result(memcmp(write_buf + ZA_PT_ZA_OFFSET, + read_buf + ZA_PT_ZA_OFFSET, + ZA_PT_ZA_SIZE(vq)) == 0, + "Data match for VL %u\n", vl); + +out_read: + free(read_buf); +out: + free(write_buf); +} + +static int do_parent(pid_t child) +{ + int ret = EXIT_FAILURE; + pid_t pid; + int status; + siginfo_t si; + unsigned int vq, vl; + bool vl_supported; + + /* Attach to the child */ + while (1) { + int sig; + + pid = wait(&status); + if (pid == -1) { + perror("wait"); + goto error; + } + + /* + * This should never happen but it's hard to flag in + * the framework. + */ + if (pid != child) + continue; + + if (WIFEXITED(status) || WIFSIGNALED(status)) + ksft_exit_fail_msg("Child died unexpectedly\n"); + + if (!WIFSTOPPED(status)) + goto error; + + sig = WSTOPSIG(status); + + if (ptrace(PTRACE_GETSIGINFO, pid, NULL, &si)) { + if (errno == ESRCH) + goto disappeared; + + if (errno == EINVAL) { + sig = 0; /* bust group-stop */ + goto cont; + } + + ksft_test_result_fail("PTRACE_GETSIGINFO: %s\n", + strerror(errno)); + goto error; + } + + if (sig == SIGSTOP && si.si_code == SI_TKILL && + si.si_pid == pid) + break; + + cont: + if (ptrace(PTRACE_CONT, pid, NULL, sig)) { + if (errno == ESRCH) + goto disappeared; + + ksft_test_result_fail("PTRACE_CONT: %s\n", + strerror(errno)); + goto error; + } + } + + /* Step through every possible VQ */ + for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; vq++) { + vl = sve_vl_from_vq(vq); + + /* First, try to set this vector length */ + ptrace_set_get_vl(child, vl, &vl_supported); + + /* If the VL is supported validate data set/get */ + if (vl_supported) { + ptrace_set_no_data(child, vl); + ptrace_set_get_data(child, vl); + } else { + ksft_test_result_skip("Disabled ZA for VL %u\n", vl); + ksft_test_result_skip("Get and set data for VL %u\n", + vl); + } + } + + ret = EXIT_SUCCESS; + +error: + kill(child, SIGKILL); + +disappeared: + return ret; +} + +int main(void) +{ + int ret = EXIT_SUCCESS; + pid_t child; + + srandom(getpid()); + + ksft_print_header(); + + if (!(getauxval(AT_HWCAP2) & HWCAP2_SME)) { + ksft_set_plan(1); + ksft_exit_skip("SME not available\n"); + } + + ksft_set_plan(EXPECTED_TESTS); + + child = fork(); + if (!child) + return do_child(); + + if (do_parent(child)) + ret = EXIT_FAILURE; + + ksft_print_cnts(); + + return ret; +} From patchwork Wed Jan 26 15:11:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537009 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50BD0C28CF5 for ; Wed, 26 Jan 2022 15:14:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242629AbiAZPOY (ORCPT ); Wed, 26 Jan 2022 10:14:24 -0500 Received: from ams.source.kernel.org ([145.40.68.75]:42168 "EHLO ams.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235870AbiAZPOY (ORCPT ); Wed, 26 Jan 2022 10:14:24 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 05395B81E15 for ; Wed, 26 Jan 2022 15:14:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 884F4C340EB; Wed, 26 Jan 2022 15:14:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210061; bh=/IsDiGLrik2jwYTUfwMASyDcTKTPn/KCUYF6KIYhml8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NrRoxuMlNc6I7sjcArBUCte4c7xCdqhOrO87XMHnILX+i9C+MxJsK/VL4SzuC3uFE sDRLnzGtL+tKqzo+Pb0PNhzMPR8H9AdRS60fdLYRZRqtzWbnb9AUquFmkMmphvGQn/ oXA6wYSCu+8j9A4sYDU1IuYeYzSSBBcNk35+oKO/80F65xEdS2EmukfzFtFqhNoIXQ GzKGLEwSKdeHTtWyyc/Tf9pmIBrqeLzfiOxLs/xCLFGq0ctH91YpXkVqKCmNk6oIdT bL3GrJ+MFYDaiyR4cqG6p/+UaDqjviGCrHgVT2cPbF8AJqYDopIZcO+KalDofzBxm6 qLvGG29GyZp3A== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 39/40] kselftest/arm64: Add SME support to syscall ABI test Date: Wed, 26 Jan 2022 15:11:19 +0000 Message-Id: <20220126151120.3811248-40-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=14984; h=from:subject; bh=/IsDiGLrik2jwYTUfwMASyDcTKTPn/KCUYF6KIYhml8=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSWeJVDEKktOGfFvufTKaqV0EDA+nJC+unOKEej MYJ6Fv2JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFklgAKCRAk1otyXVSH0Ls3B/ 90pjcA8JJfVbH0eqJWVXhHKyOLXI6pZ/XVJa3eIo0FfggIMUmAXqSuYvoyHvN7M6zs6uerpEfFaSUi VBY0YWPL1WA3YbY2zR2bCtRkmUNueQkuzFBE3uq2Hqwk0DUhrsHnhZRpyS8T3ra5I0C5TJonV/q42z dl0B5c9nCRIkrjNRrZNtlRssE82QyD9JDQT1kviS+UVRAoIDa91Tun6oc5ADNcN23eW6tk+TciiKGT gd55CiRGBqyqYj9b4MchS7s9P4rMKDrnN43VcavPWyo2p81iFlEeSOAy4sUSA5JKACeczOVQr2iB64 XdpWNX59efxMJ1GTXwVTLu1qLUvAVL X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org For every possible combination of SVE and SME vector length verify that for each possible value of SVCR after a syscall we leave streaming mode and ZA is preserved. We don't need to take account of any streaming/non streaming SVE vector length changes in the assembler code since the store instructions will handle the vector length for us. We log if the system supports FA64 and only try to set FFR in streaming mode if it does. Signed-off-by: Mark Brown --- .../selftests/arm64/abi/syscall-abi-asm.S | 69 +++++- .../testing/selftests/arm64/abi/syscall-abi.c | 204 ++++++++++++++++-- .../testing/selftests/arm64/abi/syscall-abi.h | 15 ++ 3 files changed, 265 insertions(+), 23 deletions(-) create mode 100644 tools/testing/selftests/arm64/abi/syscall-abi.h diff --git a/tools/testing/selftests/arm64/abi/syscall-abi-asm.S b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S index 983467cfcee0..bc70e04224bf 100644 --- a/tools/testing/selftests/arm64/abi/syscall-abi-asm.S +++ b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S @@ -9,15 +9,42 @@ // invoked is configured in x8 of the input GPR data. // // x0: SVE VL, 0 for FP only +// x1: SME VL // // GPRs: gpr_in, gpr_out // FPRs: fpr_in, fpr_out // Zn: z_in, z_out // Pn: p_in, p_out // FFR: ffr_in, ffr_out +// ZA: za_in, za_out +// SVCR: svcr_in, svcr_out + +#include "syscall-abi.h" .arch_extension sve +/* + * LDR (vector to ZA array): + * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _ldr_za nw, nxbase, offset=0 + .inst 0xe1000000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + +/* + * STR (vector from ZA array): + * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL] + */ +.macro _str_za nw, nxbase, offset=0 + .inst 0xe1200000 \ + | (((\nw) & 3) << 13) \ + | ((\nxbase) << 5) \ + | ((\offset) & 7) +.endm + .globl do_syscall do_syscall: // Store callee saved registers x19-x29 (80 bytes) plus x0 and x1 @@ -30,6 +57,24 @@ do_syscall: stp x25, x26, [sp, #80] stp x27, x28, [sp, #96] + // Set SVCR if we're doing SME + cbz x1, 1f + adrp x2, svcr_in + ldr x2, [x2, :lo12:svcr_in] + msr S3_3_C4_C2_2, x2 +1: + + // Load ZA if it's enabled - uses x12 as scratch due to SME LDR + tbz x2, #SVCR_ZA_SHIFT, 1f + mov w12, #0 + ldr x2, =za_in +2: _ldr_za 12, 2 + add x2, x2, x1 + add x12, x12, #1 + cmp x1, x12 + bne 2b +1: + // Load GPRs x8-x28, and save our SP/FP for later comparison ldr x2, =gpr_in add x2, x2, #64 @@ -68,7 +113,7 @@ do_syscall: ldp q30, q31, [x2, #16 * 30] 1: - // Load the SVE registers if we're doing SVE + // Load the SVE registers if we're doing SVE/SME cbz x0, 1f ldr x2, =z_in @@ -105,9 +150,13 @@ do_syscall: ldr z30, [x2, #30, MUL VL] ldr z31, [x2, #31, MUL VL] + // Only set a non-zero FFR, test patterns must be zero since the + // syscall should clear it - this lets us handle FA64. ldr x2, =ffr_in + cbz x2, 2f ldr p0, [x2, #0] wrffr p0.b +2: ldr x2, =p_in ldr p0, [x2, #0, MUL VL] @@ -169,6 +218,24 @@ do_syscall: stp q28, q29, [x2, #16 * 28] stp q30, q31, [x2, #16 * 30] + // Save SVCR if we're doing SME + cbz x1, 1f + mrs x2, S3_3_C4_C2_2 + adrp x3, svcr_out + str x2, [x3, :lo12:svcr_out] +1: + + // Save ZA if it's enabled - uses x12 as scratch due to SME STR + tbz x2, #SVCR_ZA_SHIFT, 1f + mov w12, #0 + ldr x2, =za_out +2: _str_za 12, 2 + add x2, x2, x1 + add x12, x12, #1 + cmp x1, x12 + bne 2b +1: + // Save the SVE state if we have some cbz x0, 1f diff --git a/tools/testing/selftests/arm64/abi/syscall-abi.c b/tools/testing/selftests/arm64/abi/syscall-abi.c index 1e13b7523918..b632bfe9e022 100644 --- a/tools/testing/selftests/arm64/abi/syscall-abi.c +++ b/tools/testing/selftests/arm64/abi/syscall-abi.c @@ -18,9 +18,13 @@ #include "../../kselftest.h" +#include "syscall-abi.h" + #define NUM_VL ((SVE_VQ_MAX - SVE_VQ_MIN) + 1) -extern void do_syscall(int sve_vl); +static int default_sme_vl; + +extern void do_syscall(int sve_vl, int sme_vl); static void fill_random(void *buf, size_t size) { @@ -48,14 +52,15 @@ static struct syscall_cfg { uint64_t gpr_in[NUM_GPR]; uint64_t gpr_out[NUM_GPR]; -static void setup_gpr(struct syscall_cfg *cfg, int sve_vl) +static void setup_gpr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { fill_random(gpr_in, sizeof(gpr_in)); gpr_in[8] = cfg->syscall_nr; memset(gpr_out, 0, sizeof(gpr_out)); } -static int check_gpr(struct syscall_cfg *cfg, int sve_vl) +static int check_gpr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, uint64_t svcr) { int errors = 0; int i; @@ -79,13 +84,15 @@ static int check_gpr(struct syscall_cfg *cfg, int sve_vl) uint64_t fpr_in[NUM_FPR * 2]; uint64_t fpr_out[NUM_FPR * 2]; -static void setup_fpr(struct syscall_cfg *cfg, int sve_vl) +static void setup_fpr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { fill_random(fpr_in, sizeof(fpr_in)); memset(fpr_out, 0, sizeof(fpr_out)); } -static int check_fpr(struct syscall_cfg *cfg, int sve_vl) +static int check_fpr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { int errors = 0; int i; @@ -109,13 +116,15 @@ static uint8_t z_zero[__SVE_ZREG_SIZE(SVE_VQ_MAX)]; uint8_t z_in[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; uint8_t z_out[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; -static void setup_z(struct syscall_cfg *cfg, int sve_vl) +static void setup_z(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { fill_random(z_in, sizeof(z_in)); fill_random(z_out, sizeof(z_out)); } -static int check_z(struct syscall_cfg *cfg, int sve_vl) +static int check_z(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { size_t reg_size = sve_vl; int errors = 0; @@ -126,13 +135,17 @@ static int check_z(struct syscall_cfg *cfg, int sve_vl) /* * After a syscall the low 128 bits of the Z registers should - * be preserved and the rest be zeroed or preserved. + * be preserved and the rest be zeroed or preserved, except if + * we were in streaming mode in which case the low 128 bits may + * also be cleared by the transition out of streaming mode. */ for (i = 0; i < SVE_NUM_ZREGS; i++) { void *in = &z_in[reg_size * i]; void *out = &z_out[reg_size * i]; - if (memcmp(in, out, SVE_VQ_BYTES) != 0) { + if ((memcmp(in, out, SVE_VQ_BYTES) != 0) && + !((svcr & SVCR_SM_MASK) && + memcmp(z_zero, out, SVE_VQ_BYTES) == 0)) { ksft_print_msg("%s SVE VL %d Z%d low 128 bits changed\n", cfg->name, sve_vl, i); errors++; @@ -145,13 +158,15 @@ static int check_z(struct syscall_cfg *cfg, int sve_vl) uint8_t p_in[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)]; uint8_t p_out[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)]; -static void setup_p(struct syscall_cfg *cfg, int sve_vl) +static void setup_p(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { fill_random(p_in, sizeof(p_in)); fill_random(p_out, sizeof(p_out)); } -static int check_p(struct syscall_cfg *cfg, int sve_vl) +static int check_p(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { size_t reg_size = sve_vq_from_vl(sve_vl) * 2; /* 1 bit per VL byte */ @@ -175,8 +190,19 @@ static int check_p(struct syscall_cfg *cfg, int sve_vl) uint8_t ffr_in[__SVE_PREG_SIZE(SVE_VQ_MAX)]; uint8_t ffr_out[__SVE_PREG_SIZE(SVE_VQ_MAX)]; -static void setup_ffr(struct syscall_cfg *cfg, int sve_vl) +static void setup_ffr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { + /* + * If we are in streaming mode and do not have FA64 then FFR + * is unavailable. + */ + if ((svcr & SVCR_SM_MASK) && + !(getauxval(AT_HWCAP2) & HWCAP2_SME_FA64)) { + memset(&ffr_in, 0, sizeof(ffr_in)); + return; + } + /* * It is only valid to set a contiguous set of bits starting * at 0. For now since we're expecting this to be cleared by @@ -186,7 +212,8 @@ static void setup_ffr(struct syscall_cfg *cfg, int sve_vl) fill_random(ffr_out, sizeof(ffr_out)); } -static int check_ffr(struct syscall_cfg *cfg, int sve_vl) +static int check_ffr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { size_t reg_size = sve_vq_from_vl(sve_vl) * 2; /* 1 bit per VL byte */ int errors = 0; @@ -195,6 +222,10 @@ static int check_ffr(struct syscall_cfg *cfg, int sve_vl) if (!sve_vl) return 0; + if ((svcr & SVCR_SM_MASK) && + !(getauxval(AT_HWCAP2) & HWCAP2_SME_FA64)) + return 0; + /* After a syscall the P registers should be preserved or zeroed */ for (i = 0; i < reg_size; i++) if (ffr_out[i] && (ffr_in[i] != ffr_out[i])) @@ -206,8 +237,65 @@ static int check_ffr(struct syscall_cfg *cfg, int sve_vl) return errors; } -typedef void (*setup_fn)(struct syscall_cfg *cfg, int sve_vl); -typedef int (*check_fn)(struct syscall_cfg *cfg, int sve_vl); +uint64_t svcr_in, svcr_out; + +static void setup_svcr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) +{ + svcr_in = svcr; +} + +static int check_svcr(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) +{ + int errors = 0; + + if (svcr_out & SVCR_SM_MASK) { + ksft_print_msg("%s Still in SM, SVCR %llx\n", + cfg->name, svcr_out); + errors++; + } + + if ((svcr_in & SVCR_ZA_MASK) != (svcr_out & SVCR_ZA_MASK)) { + ksft_print_msg("%s PSTATE.ZA changed, SVCR %llx != %llx\n", + cfg->name, svcr_in, svcr_out); + errors++; + } + + return errors; +} + +uint8_t za_in[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; +uint8_t za_out[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; + +static void setup_za(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) +{ + fill_random(za_in, sizeof(za_in)); + memset(za_out, 0, sizeof(za_out)); +} + +static int check_za(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) +{ + size_t reg_size = sme_vl * sme_vl; + int errors = 0; + + if (!(svcr & SVCR_ZA_MASK)) + return 0; + + if (memcmp(za_in, za_out, reg_size) != 0) { + ksft_print_msg("SME VL %d ZA does not match\n", sme_vl); + errors++; + } + + return errors; +} + +typedef void (*setup_fn)(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr); +typedef int (*check_fn)(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr); /* * Each set of registers has a setup function which is called before @@ -225,20 +313,23 @@ static struct { { setup_z, check_z }, { setup_p, check_p }, { setup_ffr, check_ffr }, + { setup_svcr, check_svcr }, + { setup_za, check_za }, }; -static bool do_test(struct syscall_cfg *cfg, int sve_vl) +static bool do_test(struct syscall_cfg *cfg, int sve_vl, int sme_vl, + uint64_t svcr) { int errors = 0; int i; for (i = 0; i < ARRAY_SIZE(regset); i++) - regset[i].setup(cfg, sve_vl); + regset[i].setup(cfg, sve_vl, sme_vl, svcr); - do_syscall(sve_vl); + do_syscall(sve_vl, sme_vl); for (i = 0; i < ARRAY_SIZE(regset); i++) - errors += regset[i].check(cfg, sve_vl); + errors += regset[i].check(cfg, sve_vl, sme_vl, svcr); return errors == 0; } @@ -246,9 +337,10 @@ static bool do_test(struct syscall_cfg *cfg, int sve_vl) static void test_one_syscall(struct syscall_cfg *cfg) { int sve_vq, sve_vl; + int sme_vq, sme_vl; /* FPSIMD only case */ - ksft_test_result(do_test(cfg, 0), + ksft_test_result(do_test(cfg, 0, default_sme_vl, 0), "%s FPSIMD\n", cfg->name); if (!(getauxval(AT_HWCAP) & HWCAP_SVE)) @@ -265,8 +357,36 @@ static void test_one_syscall(struct syscall_cfg *cfg) if (sve_vq != sve_vq_from_vl(sve_vl)) sve_vq = sve_vq_from_vl(sve_vl); - ksft_test_result(do_test(cfg, sve_vl), + ksft_test_result(do_test(cfg, sve_vl, default_sme_vl, 0), "%s SVE VL %d\n", cfg->name, sve_vl); + + if (!(getauxval(AT_HWCAP2) & HWCAP2_SME)) + continue; + + for (sme_vq = SVE_VQ_MAX; sme_vq > 0; --sme_vq) { + sme_vl = prctl(PR_SME_SET_VL, sme_vq * 16); + if (sme_vl == -1) + ksft_exit_fail_msg("PR_SME_SET_VL failed: %s (%d)\n", + strerror(errno), errno); + + sme_vl &= PR_SME_VL_LEN_MASK; + + if (sme_vq != sve_vq_from_vl(sme_vl)) + sme_vq = sve_vq_from_vl(sme_vl); + + ksft_test_result(do_test(cfg, sve_vl, sme_vl, + SVCR_ZA_MASK | SVCR_SM_MASK), + "%s SVE VL %d/SME VL %d SM+ZA\n", + cfg->name, sve_vl, sme_vl); + ksft_test_result(do_test(cfg, sve_vl, sme_vl, + SVCR_SM_MASK), + "%s SVE VL %d/SME VL %d SM\n", + cfg->name, sve_vl, sme_vl); + ksft_test_result(do_test(cfg, sve_vl, sme_vl, + SVCR_ZA_MASK), + "%s SVE VL %d/SME VL %d ZA\n", + cfg->name, sve_vl, sme_vl); + } } } @@ -299,14 +419,54 @@ int sve_count_vls(void) return vl_count; } +int sme_count_vls(void) +{ + unsigned int vq; + int vl_count = 0; + int vl; + + if (!(getauxval(AT_HWCAP2) & HWCAP2_SME)) + return 0; + + /* Ensure we configure a SME VL, used to flag if SVCR is set */ + default_sme_vl = 16; + + /* + * Enumerate up to SVE_VQ_MAX vector lengths + */ + for (vq = SVE_VQ_MAX; vq > 0; --vq) { + vl = prctl(PR_SME_SET_VL, vq * 16); + if (vl == -1) + ksft_exit_fail_msg("PR_SME_SET_VL failed: %s (%d)\n", + strerror(errno), errno); + + vl &= PR_SME_VL_LEN_MASK; + + if (vq != sve_vq_from_vl(vl)) + vq = sve_vq_from_vl(vl); + + vl_count++; + } + + return vl_count; +} + int main(void) { int i; + int tests = 1; /* FPSIMD */ srandom(getpid()); ksft_print_header(); - ksft_set_plan(ARRAY_SIZE(syscalls) * (sve_count_vls() + 1)); + tests += sve_count_vls(); + tests += (sve_count_vls() * sme_count_vls()) * 3; + ksft_set_plan(ARRAY_SIZE(syscalls) * tests); + + if (getauxval(AT_HWCAP2) & HWCAP2_SME_FA64) + ksft_print_msg("SME with FA64\n"); + else if (getauxval(AT_HWCAP2) & HWCAP2_SME) + ksft_print_msg("SME without FA64\n"); for (i = 0; i < ARRAY_SIZE(syscalls); i++) test_one_syscall(&syscalls[i]); diff --git a/tools/testing/selftests/arm64/abi/syscall-abi.h b/tools/testing/selftests/arm64/abi/syscall-abi.h new file mode 100644 index 000000000000..bda5a87ad381 --- /dev/null +++ b/tools/testing/selftests/arm64/abi/syscall-abi.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2021 ARM Limited. + */ + +#ifndef SYSCALL_ABI_H +#define SYSCALL_ABI_H + +#define SVCR_ZA_MASK 2 +#define SVCR_SM_MASK 1 + +#define SVCR_ZA_SHIFT 1 +#define SVCR_SM_SHIFT 0 + +#endif From patchwork Wed Jan 26 15:11:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 537354 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7589C3526D for ; Wed, 26 Jan 2022 15:14:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242662AbiAZPOd (ORCPT ); Wed, 26 Jan 2022 10:14:33 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:42214 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242627AbiAZPO1 (ORCPT ); Wed, 26 Jan 2022 10:14:27 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D718361857 for ; Wed, 26 Jan 2022 15:14:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4818BC340E8; Wed, 26 Jan 2022 15:14:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1643210066; bh=copy4cRQ6puPNwPV3UaCKRsGUK05uLwxOUu+r7/XC10=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z78BLYv32FxtO975+1ZRXnyIDr1HVDi5tg9h1si/BHxCjOFnFExApxIrbNBgY1/8G 2LaXk4DreF+ZgjE3rmluBZXR/kQC7ROFrKNaPukUIOo4HS+dUUsg/1ESIh8+v/Pj01 XP0C7ML4IhpHSnU+mC9RyR/W9wdOWfMbntB/BoEGZHoBErvhoJJJJaBaLvK9Js54cu uFiK17q2ZyvkxwMnzgE0MqCRC9TAVBly4bPoneGEl8ag5oDI6qr1Ymqc5aYraXHIgp LY+k0xKH6buJicrdmx5qOvmYAi9VFZCFUJVLt3EwEj0zjkEGJPPke1K/1aPVBvi4a8 W3ZY4reYpUZ4Q== From: Mark Brown To: Catalin Marinas , Will Deacon , Marc Zyngier , Shuah Khan , Shuah Khan Cc: Alan Hayward , Luis Machado , Salil Akerkar , Basant Kumar Dwivedi , Szabolcs Nagy , James Morse , Alexandru Elisei , Suzuki K Poulose , linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu, Mark Brown Subject: [PATCH v9 40/40] squqsh traps Date: Wed, 26 Jan 2022 15:11:20 +0000 Message-Id: <20220126151120.3811248-41-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220126151120.3811248-1-broonie@kernel.org> References: <20220126151120.3811248-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4479; h=from:subject; bh=copy4cRQ6puPNwPV3UaCKRsGUK05uLwxOUu+r7/XC10=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBh8WSWpilDW9eFLR9WsG+pu8O0oF7wMm6YV2gEwgCG DNHOJqqJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCYfFklgAKCRAk1otyXVSH0CjgB/ 9L+PEoqD+XcoxnxNpfVepzo0f0TuyQ/QjuCAai209ngWlT2tkE+A7RbRVGotpHUNWdRqTru1xQJ2iR kPGJXAWAUkpDOZb8Xe661bgaYNXIEBgabR3RhuOdTH/6pwrHhO3vpg+m4LAQNRTShE/ieYIuAb2r4w 4UdbuVJElFW0YcMiMsQgXTvwJgtG8RcS7zAs1I7YYTS85+x0jPrII3e4leiZJkS9vDt31Dgd7VXps1 wl4suOMcTKs+sdlJNH0fzGVEzf0WAQryFYfIXBUpCtOu1VXDKGyrb8tIqH5Xy1arbux0iOyDdI8oAN S3q7A55s/MzSSBaWoQFUyEenBIgMQL X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org --- arch/arm64/kvm/fpsimd.c | 22 ++++++++++------------ arch/arm64/kvm/hyp/nvhe/switch.c | 20 ++++++++++---------- arch/arm64/kvm/hyp/vhe/switch.c | 4 ++-- 3 files changed, 22 insertions(+), 24 deletions(-) diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index cecaddb644ce..1c585553d74f 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -153,18 +153,16 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) * If we have VHE then the Hyp code will reset CPACR_EL1 to * CPACR_EL1_DEFAULT and we need to reenable SME. */ - if (has_vhe()) { - if (system_supports_sme()) { - /* Also restore EL0 state seen on entry */ - if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED) - sysreg_clear_set(CPACR_EL1, 0, - CPACR_EL1_SMEN_EL0EN | - CPACR_EL1_SMEN_EL1EN); - else - sysreg_clear_set(CPACR_EL1, - CPACR_EL1_SMEN_EL0EN, - CPACR_EL1_SMEN_EL1EN); - } + if (has_vhe() && system_supports_sme()) { + /* Also restore EL0 state seen on entry */ + if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED) + sysreg_clear_set(CPACR_EL1, 0, + CPACR_EL1_SMEN_EL0EN | + CPACR_EL1_SMEN_EL1EN); + else + sysreg_clear_set(CPACR_EL1, + CPACR_EL1_SMEN_EL0EN, + CPACR_EL1_SMEN_EL1EN); } if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) { diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 184bf6bd79b9..caace61ea459 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -47,21 +47,20 @@ static void __activate_traps(struct kvm_vcpu *vcpu) val |= CPTR_EL2_TFP | CPTR_EL2_TZ; __activate_traps_fpsimd32(vcpu); } - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + if (cpus_have_final_cap(ARM64_SME)) val |= CPTR_EL2_TSM; write_sysreg(val, cptr_el2); write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2); - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) && - cpus_have_final_cap(ARM64_HAS_FGT)) { + if (cpus_have_final_cap(ARM64_SME)) { val = read_sysreg_s(SYS_HFGRTR_EL2); - val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK | + val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK); write_sysreg_s(val, SYS_HFGRTR_EL2); val = read_sysreg_s(SYS_HFGWTR_EL2); - val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK | + val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK); write_sysreg_s(val, SYS_HFGWTR_EL2); } @@ -109,23 +108,24 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2); - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) && - cpus_have_final_cap(ARM64_HAS_FGT)) { + if (cpus_have_final_cap(ARM64_SME)) { u64 val; val = read_sysreg_s(SYS_HFGRTR_EL2); - val |= HFGxTR_EL2_nTPIDR_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK; + val |= HFGxTR_EL2_nTPIDR2_EL0_MASK | + HFGxTR_EL2_nSMPRI_EL1_MASK; write_sysreg_s(val, SYS_HFGRTR_EL2); val = read_sysreg_s(SYS_HFGWTR_EL2); - val |= HFGxTR_EL2_nTPIDR_EL0_MASK | HFGxTR_EL2_nSMPRI_EL1_MASK; + val |= HFGxTR_EL2_nTPIDR2_EL0_MASK | + HFGxTR_EL2_nSMPRI_EL1_MASK; write_sysreg_s(val, SYS_HFGWTR_EL2); } cptr = CPTR_EL2_DEFAULT; if (vcpu_has_sve(vcpu) && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)) cptr |= CPTR_EL2_TZ; - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + if (cpus_have_final_cap(ARM64_SME)) cptr &= ~CPTR_EL2_TSM; write_sysreg(cptr, cptr_el2); diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 21217485514d..a4d2fb5c9710 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -60,7 +60,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu) __activate_traps_fpsimd32(vcpu); } - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + if (cpus_have_final_cap(ARM64_SME)) write_sysreg(read_sysreg(sctlr_el2) & ~SCTLR_ELx_ENTP2, sctlr_el2); @@ -85,7 +85,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) */ asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT)); - if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) + if (cpus_have_final_cap(ARM64_SME)) write_sysreg(read_sysreg(sctlr_el2) | SCTLR_ELx_ENTP2, sctlr_el2);