From patchwork Wed Mar 31 19:13:56 2021
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 413175
From: ira.weiny@intel.com
To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra
Cc: Ira Weiny , Dan Williams , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, Fenghua Yu , linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: [PATCH V5 01/10] x86/pkeys: Create pkeys_common.h
Date: Wed, 31 Mar 2021 12:13:56 -0700
Message-Id: <20210331191405.341999-2-ira.weiny@intel.com>
In-Reply-To: <20210331191405.341999-1-ira.weiny@intel.com>
References: <20210331191405.341999-1-ira.weiny@intel.com>
From: Ira Weiny

Protection Keys User (PKU) and Protection Keys Supervisor (PKS) work in similar fashions and can share common defines. Specifically PKS and PKU each have: 1. A single control register 2. The same number of keys 3. The same number of bits in the register per key 4. Access and Write disable in the same bit locations. Given the above, share all the macros that synthesize and manipulate register values between the two features. Unlike PKU, the PKS definitions are needed in both pgtable.h and pkeys.h. Create a common header for those 2 headers to share. The alternative, including pgtable.h in pkeys.h, triggers complex header dependencies. Share these defines by moving them into a new header, change their names to reflect the common use, and include the header where needed.
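For readers new to the layout, a minimal user-space sketch of what these shared macros encode follows. The PKR_* definitions are copied from the pkeys_common.h hunk in the diff below; the main() harness and the printout are illustrative only and not part of the patch.

	/*
	 * Illustrative user-space sketch of the shared register layout
	 * described above.  The PKR_* bodies mirror the new
	 * asm/pkeys_common.h added by this patch; everything else here
	 * (main, printf) is demonstration scaffolding.
	 */
	#include <stdio.h>
	#include <stdint.h>

	#define PKR_AD_BIT	   0x1	/* Access Disable */
	#define PKR_WD_BIT	   0x2	/* Write Disable */
	#define PKR_BITS_PER_PKEY  2	/* 16 keys x 2 bits = one 32-bit register */

	/* Access-Disable mask for one pkey; OR several together for a full value. */
	#define PKR_AD_KEY(pkey)  (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY))

	int main(void)
	{
		uint32_t val = 0;
		int pkey;

		/* Access-disable every key except key 0, as init_pkru_value does. */
		for (pkey = 1; pkey < 16; pkey++)
			val |= PKR_AD_KEY(pkey);

		printf("default register value: 0x%08x\n", val);	/* 0x55555554 */
		printf("key 3 AD|WD mask:       0x%08x\n",
		       (PKR_AD_BIT | PKR_WD_BIT) << (3 * PKR_BITS_PER_PKEY));
		return 0;
	}

Because both PKU and PKS pack two disable bits per key into one 32-bit register, the same mask arithmetic serves both features; only the register being written (PKRU versus the supervisor MSR) differs.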
Reviewed-by: Dan Williams Signed-off-by: Ira Weiny --- NOTE: The initialization of init_pkru_value cause checkpatch errors because of the space after the '(' in the macros. We leave this as is because it is more readable in this format. And it was existing code. --- Changes from V3: From Dan Williams Fix guard macro names Reword commit message. Changes from RFC V3 Per Dave Hansen Update commit message Add comment to PKR_AD_KEY macro --- arch/x86/include/asm/pgtable.h | 13 ++++++------- arch/x86/include/asm/pkeys.h | 2 ++ arch/x86/include/asm/pkeys_common.h | 15 +++++++++++++++ arch/x86/kernel/fpu/xstate.c | 8 ++++---- arch/x86/mm/pkeys.c | 14 ++++++-------- 5 files changed, 33 insertions(+), 19 deletions(-) create mode 100644 arch/x86/include/asm/pkeys_common.h diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a02c67291cfc..bfbfb951fe65 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1360,9 +1360,7 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd) } #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ -#define PKRU_AD_BIT 0x1 -#define PKRU_WD_BIT 0x2 -#define PKRU_BITS_PER_PKEY 2 +#include #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS extern u32 init_pkru_value; @@ -1372,18 +1370,19 @@ extern u32 init_pkru_value; static inline bool __pkru_allows_read(u32 pkru, u16 pkey) { - int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY; - return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits)); + int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY; + + return !(pkru & (PKR_AD_BIT << pkru_pkey_bits)); } static inline bool __pkru_allows_write(u32 pkru, u16 pkey) { - int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY; + int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY; /* * Access-disable disables writes too so we need to check * both bits here. */ - return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits)); + return !(pkru & ((PKR_AD_BIT|PKR_WD_BIT) << pkru_pkey_bits)); } static inline u16 pte_flags_pkey(unsigned long pte_flags) diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h index 2ff9b98812b7..f9feba80894b 100644 --- a/arch/x86/include/asm/pkeys.h +++ b/arch/x86/include/asm/pkeys.h @@ -2,6 +2,8 @@ #ifndef _ASM_X86_PKEYS_H #define _ASM_X86_PKEYS_H +#include + #define ARCH_DEFAULT_PKEY 0 /* diff --git a/arch/x86/include/asm/pkeys_common.h b/arch/x86/include/asm/pkeys_common.h new file mode 100644 index 000000000000..e40b0ced733f --- /dev/null +++ b/arch/x86/include/asm/pkeys_common.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_PKEYS_COMMON_H +#define _ASM_X86_PKEYS_COMMON_H + +#define PKR_AD_BIT 0x1 +#define PKR_WD_BIT 0x2 +#define PKR_BITS_PER_PKEY 2 + +/* + * Generate an Access-Disable mask for the given pkey. Several of these can be + * OR'd together to generate pkey register values. 
+ */ +#define PKR_AD_KEY(pkey) (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY)) + +#endif /*_ASM_X86_PKEYS_COMMON_H */ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 683749b80ae2..face29dab0e3 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -995,7 +995,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val) { u32 old_pkru; - int pkey_shift = (pkey * PKRU_BITS_PER_PKEY); + int pkey_shift = (pkey * PKR_BITS_PER_PKEY); u32 new_pkru_bits = 0; /* @@ -1014,16 +1014,16 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, /* Set the bits we need in PKRU: */ if (init_val & PKEY_DISABLE_ACCESS) - new_pkru_bits |= PKRU_AD_BIT; + new_pkru_bits |= PKR_AD_BIT; if (init_val & PKEY_DISABLE_WRITE) - new_pkru_bits |= PKRU_WD_BIT; + new_pkru_bits |= PKR_WD_BIT; /* Shift the bits in to the correct place in PKRU for pkey: */ new_pkru_bits <<= pkey_shift; /* Get old PKRU and mask off any old bits in place: */ old_pkru = read_pkru(); - old_pkru &= ~((PKRU_AD_BIT|PKRU_WD_BIT) << pkey_shift); + old_pkru &= ~((PKR_AD_BIT|PKR_WD_BIT) << pkey_shift); /* Write old part along with new part: */ write_pkru(old_pkru | new_pkru_bits); diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c index 8873ed1438a9..f5efb4007e74 100644 --- a/arch/x86/mm/pkeys.c +++ b/arch/x86/mm/pkeys.c @@ -111,19 +111,17 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey return vma_pkey(vma); } -#define PKRU_AD_KEY(pkey) (PKRU_AD_BIT << ((pkey) * PKRU_BITS_PER_PKEY)) - /* * Make the default PKRU value (at execve() time) as restrictive * as possible. This ensures that any threads clone()'d early * in the process's lifetime will not accidentally get access * to data which is pkey-protected later on. */ -u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) | - PKRU_AD_KEY( 4) | PKRU_AD_KEY( 5) | PKRU_AD_KEY( 6) | - PKRU_AD_KEY( 7) | PKRU_AD_KEY( 8) | PKRU_AD_KEY( 9) | - PKRU_AD_KEY(10) | PKRU_AD_KEY(11) | PKRU_AD_KEY(12) | - PKRU_AD_KEY(13) | PKRU_AD_KEY(14) | PKRU_AD_KEY(15); +u32 init_pkru_value = PKR_AD_KEY( 1) | PKR_AD_KEY( 2) | PKR_AD_KEY( 3) | + PKR_AD_KEY( 4) | PKR_AD_KEY( 5) | PKR_AD_KEY( 6) | + PKR_AD_KEY( 7) | PKR_AD_KEY( 8) | PKR_AD_KEY( 9) | + PKR_AD_KEY(10) | PKR_AD_KEY(11) | PKR_AD_KEY(12) | + PKR_AD_KEY(13) | PKR_AD_KEY(14) | PKR_AD_KEY(15); /* * Called from the FPU code when creating a fresh set of FPU @@ -173,7 +171,7 @@ static ssize_t init_pkru_write_file(struct file *file, * up immediately if someone attempts to disable access * or writes to pkey 0. 
*/ - if (new_init_pkru & (PKRU_AD_BIT|PKRU_WD_BIT)) + if (new_init_pkru & (PKR_AD_BIT|PKR_WD_BIT)) return -EINVAL; WRITE_ONCE(init_pkru_value, new_init_pkru); From patchwork Wed Mar 31 19:13:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 413176 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2104EC43460 for ; Wed, 31 Mar 2021 19:15:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8FD161075 for ; Wed, 31 Mar 2021 19:15:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235734AbhCaTOl (ORCPT ); Wed, 31 Mar 2021 15:14:41 -0400 Received: from mga12.intel.com ([192.55.52.136]:51675 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234922AbhCaTOP (ORCPT ); Wed, 31 Mar 2021 15:14:15 -0400 IronPort-SDR: 4a+VVHoUFDGwOL79kYpTdzhhvAulnlx4UEVmZVkj+n3uvZ4ls1YhZ3nBi3Mmcst2CcLvT6mSYk 6DFV7Ep+zYEw== X-IronPort-AV: E=McAfee;i="6000,8403,9940"; a="171490593" X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="171490593" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:15 -0700 IronPort-SDR: IjFkLYsM8F1gNcD5ahlKsXkbhNxLpN98CrB7OFbcQ/xrtnUpvT6Hreol7Bk1K44pJKKSb8JsAt 9A/GZonk2yuQ== X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="610625378" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:15 -0700 From: ira.weiny@intel.com To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra Cc: Ira Weiny , Dan Williams , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, Fenghua Yu , linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH V5 02/10] x86/fpu: Refactor arch_set_user_pkey_access() for PKS support Date: Wed, 31 Mar 2021 12:13:57 -0700 Message-Id: <20210331191405.341999-3-ira.weiny@intel.com> X-Mailer: git-send-email 2.28.0.rc0.12.gb6a658bd00c9 In-Reply-To: <20210331191405.341999-1-ira.weiny@intel.com> References: <20210331191405.341999-1-ira.weiny@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Ira Weiny Define a helper, update_pkey_val(), which will be used to support both Protection Key User (PKU) and the new Protection Key for Supervisor (PKS) in subsequent patches. 
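The refactor below centralizes the read-modify-write of the per-key bits. As a standalone sketch, assuming the PKEY_DISABLE_* flags keep their uapi values of 0x1 and 0x2, the helper's logic can be exercised in user space like this; update_pkey_val() here mirrors the version added to arch/x86/mm/pkeys.c in the diff that follows, and main() is only a demonstration harness.

	/*
	 * Standalone sketch of the read-modify-write pattern the new helper
	 * factors out.  update_pkey_val() mirrors the function added to
	 * arch/x86/mm/pkeys.c by this patch.
	 */
	#include <stdio.h>
	#include <stdint.h>

	#define PKR_AD_BIT	    0x1
	#define PKR_WD_BIT	    0x2
	#define PKR_BITS_PER_PKEY   2

	#define PKEY_DISABLE_ACCESS 0x1
	#define PKEY_DISABLE_WRITE  0x2

	static uint32_t update_pkey_val(uint32_t pk_reg, int pkey, unsigned int flags)
	{
		int pkey_shift = pkey * PKR_BITS_PER_PKEY;

		/* Clear the two bits that belong to @pkey ... */
		pk_reg &= ~(((1u << PKR_BITS_PER_PKEY) - 1) << pkey_shift);

		/* ... then OR in the requested disable bits. */
		if (flags & PKEY_DISABLE_ACCESS)
			pk_reg |= PKR_AD_BIT << pkey_shift;
		if (flags & PKEY_DISABLE_WRITE)
			pk_reg |= PKR_WD_BIT << pkey_shift;

		return pk_reg;
	}

	int main(void)
	{
		uint32_t reg = 0x55555554;	/* the restrictive default */

		reg = update_pkey_val(reg, 3, 0);			/* key 3: read/write */
		reg = update_pkey_val(reg, 4, PKEY_DISABLE_WRITE);	/* key 4: read-only  */
		printf("0x%08x\n", reg);	/* prints 0x55555614 */
		return 0;
	}

With this helper in place, arch_set_user_pkey_access() shrinks to read_pkru(), update_pkey_val(), write_pkru(), and the same logic can later be reused when updating the supervisor PKRS value.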
Reviewed-by: Dan Williams Co-developed-by: Peter Zijlstra Signed-off-by: Peter Zijlstra Signed-off-by: Ira Weiny --- Changes from RFC V3: Per Dave Hansen Update and add comments per Dave's review Per Peter Correct attribution --- arch/x86/include/asm/pkeys.h | 2 ++ arch/x86/kernel/fpu/xstate.c | 22 ++++------------------ arch/x86/mm/pkeys.c | 23 +++++++++++++++++++++++ 3 files changed, 29 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h index f9feba80894b..4526245b03e5 100644 --- a/arch/x86/include/asm/pkeys.h +++ b/arch/x86/include/asm/pkeys.h @@ -136,4 +136,6 @@ static inline int vma_pkey(struct vm_area_struct *vma) return (vma->vm_flags & vma_pkey_mask) >> VM_PKEY_SHIFT; } +u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags); + #endif /*_ASM_X86_PKEYS_H */ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index face29dab0e3..00251bdf759b 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -994,9 +994,7 @@ const void *get_xsave_field_ptr(int xfeature_nr) int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val) { - u32 old_pkru; - int pkey_shift = (pkey * PKR_BITS_PER_PKEY); - u32 new_pkru_bits = 0; + u32 pkru; /* * This check implies XSAVE support. OSPKE only gets @@ -1012,21 +1010,9 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, */ WARN_ON_ONCE(pkey >= arch_max_pkey()); - /* Set the bits we need in PKRU: */ - if (init_val & PKEY_DISABLE_ACCESS) - new_pkru_bits |= PKR_AD_BIT; - if (init_val & PKEY_DISABLE_WRITE) - new_pkru_bits |= PKR_WD_BIT; - - /* Shift the bits in to the correct place in PKRU for pkey: */ - new_pkru_bits <<= pkey_shift; - - /* Get old PKRU and mask off any old bits in place: */ - old_pkru = read_pkru(); - old_pkru &= ~((PKR_AD_BIT|PKR_WD_BIT) << pkey_shift); - - /* Write old part along with new part: */ - write_pkru(old_pkru | new_pkru_bits); + pkru = read_pkru(); + pkru = update_pkey_val(pkru, pkey, init_val); + write_pkru(pkru); return 0; } diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c index f5efb4007e74..d1dfe743e79f 100644 --- a/arch/x86/mm/pkeys.c +++ b/arch/x86/mm/pkeys.c @@ -208,3 +208,26 @@ static __init int setup_init_pkru(char *opt) return 1; } __setup("init_pkru=", setup_init_pkru); + +/* + * Replace disable bits for @pkey with values from @flags + * + * Kernel users use the same flags as user space: + * PKEY_DISABLE_ACCESS + * PKEY_DISABLE_WRITE + */ +u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags) +{ + int pkey_shift = pkey * PKR_BITS_PER_PKEY; + + /* Mask out old bit values */ + pk_reg &= ~(((1 << PKR_BITS_PER_PKEY) - 1) << pkey_shift); + + /* Or in new values */ + if (flags & PKEY_DISABLE_ACCESS) + pk_reg |= PKR_AD_BIT << pkey_shift; + if (flags & PKEY_DISABLE_WRITE) + pk_reg |= PKR_WD_BIT << pkey_shift; + + return pk_reg; +} From patchwork Wed Mar 31 19:13:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 413174 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) 
with ESMTP id DC512C43611 for ; Wed, 31 Mar 2021 19:15:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C74B561075 for ; Wed, 31 Mar 2021 19:15:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235975AbhCaTOp (ORCPT ); Wed, 31 Mar 2021 15:14:45 -0400 Received: from mga12.intel.com ([192.55.52.136]:51675 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235071AbhCaTOS (ORCPT ); Wed, 31 Mar 2021 15:14:18 -0400 IronPort-SDR: I7FsLRxJyhFpLXFSgbR6dUrX6EkQSj+YYdbSR1s43r3e1hja6KG1kpzraLyBFYtg0MlnQB72DL xaH9tMdBzNPA== X-IronPort-AV: E=McAfee;i="6000,8403,9940"; a="171490606" X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="171490606" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:18 -0700 IronPort-SDR: YIUMPilsVwhW1NiIvFp/vPcEZfTqHx1jz8UVgdCI34evD+lCCqhSLjqSJPFyQ63hOAiOKJiFdl 2kDXREDSMQIg== X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="610625390" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:18 -0700 From: ira.weiny@intel.com To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra Cc: Ira Weiny , Dan Williams , Fenghua Yu , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH V5 04/10] x86/pks: Add PKS defines and Kconfig options Date: Wed, 31 Mar 2021 12:13:59 -0700 Message-Id: <20210331191405.341999-5-ira.weiny@intel.com> X-Mailer: git-send-email 2.28.0.rc0.12.gb6a658bd00c9 In-Reply-To: <20210331191405.341999-1-ira.weiny@intel.com> References: <20210331191405.341999-1-ira.weiny@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Ira Weiny Protection Keys for Supervisor pages (PKS) enables fast, hardware thread specific, manipulation of permission restrictions on supervisor page mappings. It uses the same mechanism of Protection Keys as those on User mappings but applies that mechanism to supervisor mappings using a supervisor specific MSR. Kernel users can define domains of page mappings which have an extra level of protection beyond those specified in the supervisor page table entries. Define the PKS CPU feature bits. Add the Kconfig ARCH_HAS_SUPERVISOR_PKEYS to indicate to consumers that an architecture supports pkeys. Introduce ARCH_ENABLE_SUPERVISOR_PKEYS to allow kernel users to specify to the arch that they wish to use the supervisor key support if ARCH_HAS_SUPERVISOR_PKEYS is available. ARCH_ENABLE_SUPERVISOR_PKEYS remains off until the first use case sets it. 
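Since cpufeatures word 16 used below is the CPUID.(EAX=7,ECX=0):ECX leaf, the new X86_FEATURE_PKS at 16*32+31 corresponds to ECX bit 31 of that leaf. A small, illustrative user-space probe for the raw hardware capability (not part of this series) might look like the sketch below; note that even on PKS-capable hardware the kernel masks the feature through DISABLE_PKS unless a consumer selects ARCH_ENABLE_SUPERVISOR_PKEYS, as the disabled-features.h hunk in the diff that follows shows.

	/*
	 * Illustrative probe for the raw PKS CPUID bit (not part of this
	 * series).  Word 16 of the kernel's cpufeatures table is
	 * CPUID.(EAX=7,ECX=0):ECX, so X86_FEATURE_PKS at 16*32+31 maps to
	 * ECX bit 31 of that leaf.
	 */
	#include <stdio.h>
	#include <cpuid.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
			puts("CPUID leaf 7 not supported");
			return 1;
		}

		printf("PKS (CPUID.7.0:ECX[31]): %s\n",
		       (ecx & (1u << 31)) ? "present" : "absent");
		return 0;
	}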
Reviewed-by: Dan Williams Co-developed-by: Fenghua Yu Signed-off-by: Fenghua Yu Signed-off-by: Ira Weiny --- Changes from V3: From Dan Clean up commit message Add ARCH_ENABLE_SUPERVISOR_PKEYS option so we don't have the overhead of PKS unless there is a user Clean up commit message grammar Changes from V2 New patch for V3: Split this off from the enable patch to be able to create cleaner bisectability --- arch/x86/Kconfig | 1 + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/disabled-features.h | 8 +++++++- arch/x86/include/uapi/asm/processor-flags.h | 2 ++ mm/Kconfig | 4 ++++ 5 files changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2792879d398e..5e3a7c2bc342 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1870,6 +1870,7 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS depends on X86_64 && (CPU_SUP_INTEL || CPU_SUP_AMD) select ARCH_USES_HIGH_VMA_FLAGS select ARCH_HAS_PKEYS + select ARCH_HAS_SUPERVISOR_PKEYS help Memory Protection Keys provides a mechanism for enforcing page-based protections, but without requiring modification of the diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index cc96e26d69f7..83ed73407417 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -359,6 +359,7 @@ #define X86_FEATURE_MOVDIR64B (16*32+28) /* MOVDIR64B instruction */ #define X86_FEATURE_ENQCMD (16*32+29) /* ENQCMD and ENQCMDS instructions */ #define X86_FEATURE_SGX_LC (16*32+30) /* Software Guard Extensions Launch Control */ +#define X86_FEATURE_PKS (16*32+31) /* Protection Keys for Supervisor pages */ /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */ #define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index b7dd944dc867..fd09ae852c04 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -44,6 +44,12 @@ # define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31)) #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */ +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS +# define DISABLE_PKS 0 +#else +# define DISABLE_PKS (1<<(X86_FEATURE_PKS & 31)) +#endif + #ifdef CONFIG_X86_5LEVEL # define DISABLE_LA57 0 #else @@ -88,7 +94,7 @@ #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \ - DISABLE_ENQCMD) + DISABLE_ENQCMD|DISABLE_PKS) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 #define DISABLED_MASK19 0 diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index bcba3c643e63..191c574b2390 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -130,6 +130,8 @@ #define X86_CR4_SMAP _BITUL(X86_CR4_SMAP_BIT) #define X86_CR4_PKE_BIT 22 /* enable Protection Keys support */ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) +#define X86_CR4_PKS_BIT 24 /* enable Protection Keys for Supervisor */ +#define X86_CR4_PKS _BITUL(X86_CR4_PKS_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/mm/Kconfig b/mm/Kconfig index 24c045b24b95..c7d1fc780358 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -808,6 +808,10 @@ config ARCH_USES_HIGH_VMA_FLAGS bool config ARCH_HAS_PKEYS bool +config ARCH_HAS_SUPERVISOR_PKEYS + bool +config ARCH_ENABLE_SUPERVISOR_PKEYS + bool config PERCPU_STATS bool "Collect percpu memory statistics" From patchwork 
Wed Mar 31 19:14:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 413172 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E54DC001B3 for ; Wed, 31 Mar 2021 19:15:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DF2DE61002 for ; Wed, 31 Mar 2021 19:15:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236114AbhCaTOt (ORCPT ); Wed, 31 Mar 2021 15:14:49 -0400 Received: from mga12.intel.com ([192.55.52.136]:51691 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235405AbhCaTOV (ORCPT ); Wed, 31 Mar 2021 15:14:21 -0400 IronPort-SDR: 5k71VMr2/yBW7rHragUdlkEIoBz1tNdr9QgVLZmSkVi0WiQUIEhlqvxd2SELt5t+oRdshkGzQL mg+6S7gMQVPg== X-IronPort-AV: E=McAfee;i="6000,8403,9940"; a="171490621" X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="171490621" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:21 -0700 IronPort-SDR: v3r3EqCUzTxSAiOJIcRSsUeUO25MfgocMvV/JzJ378UWSIrfAy23AgGmhUF5ueqFNi774Q9ztj AgbPzb40ZxyQ== X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="610625408" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:21 -0700 From: ira.weiny@intel.com To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra Cc: Ira Weiny , Dan Williams , Fenghua Yu , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH V5 07/10] x86/pks: Preserve the PKRS MSR on context switch Date: Wed, 31 Mar 2021 12:14:02 -0700 Message-Id: <20210331191405.341999-8-ira.weiny@intel.com> X-Mailer: git-send-email 2.28.0.rc0.12.gb6a658bd00c9 In-Reply-To: <20210331191405.341999-1-ira.weiny@intel.com> References: <20210331191405.341999-1-ira.weiny@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Ira Weiny The PKRS MSR is defined as a per-logical-processor register. This isolates memory access by logical CPU. Unfortunately, the MSR is not managed by XSAVE. Therefore, tasks must save/restore the MSR value on context switch. Define a saved PKRS value in the task struct, as well as a cached per-logical-processor MSR value which mirrors the MSR value of the current CPU. Initialize all tasks with the default MSR value. Then, on schedule in, call write_pkrs() which automatically avoids the overhead of the MSR write if possible. Reviewed-by: Dan Williams Co-developed-by: Fenghua Yu Signed-off-by: Fenghua Yu Signed-off-by: Ira Weiny --- Changes from V4 From kernel test robot Fix i386 build: pks_init_task not found Move MSR_IA32_PKRS and INIT_PKRS_VALUE to patch 5 where they are 'used'. 
(Technically nothing is used until the final test patch but this organization makes review better.) Fix checkpatch errors Changes from V3 From Dan Williams make pks_init_task() and pks_sched_in() macros To avoid Supervisor PKey '#ifdefery' in process.c and process_64.c Split write_pkrs() to an earlier patch to be used in setup_pks() Move Peter's authorship to that patch. From Dan Williams Use ARCH_ENABLE_SUPERVISOR_PKEYS Remove kernel doc comment from write_pkrs From Thomas Gleixner Fix where pks_sched_in() is called from. Should be called from __switch_to() NOTE: PKS requires x86_64 so there is no need to update process_32.c Make pkrs_cache static Remove unnecessary pkrs_cache declaration Clean up formatting Changes from V2 Adjust for PKS enable being final patch. Changes from V1 Rebase to latest tip/master Resolve conflicts with INIT_THREAD changes Changes since RFC V3 Per Dave Hansen Update commit message move saved_pkrs to be in a nicer place Per Peter Zijlstra Add Comment from Peter Clean up white space Update authorship --- arch/x86/include/asm/processor.h | 47 +++++++++++++++++++++++++++++++- arch/x86/kernel/process.c | 3 ++ arch/x86/kernel/process_64.c | 2 ++ 3 files changed, 51 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index dc6d149bf851..e0ffb9c849c5 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -18,6 +18,7 @@ struct vm86; #include #include #include +#include #include #include #include @@ -519,6 +520,12 @@ struct thread_struct { unsigned long cr2; unsigned long trap_nr; unsigned long error_code; + +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS + /* Saved Protection key register for supervisor mappings */ + u32 saved_pkrs; +#endif + #ifdef CONFIG_VM86 /* Virtual 86 mode info */ struct vm86 *vm86; @@ -775,6 +782,37 @@ static inline void spin_lock_prefetch(const void *x) ((struct pt_regs *)__ptr) - 1; \ }) +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS + +void write_pkrs(u32 new_pkrs); + +/* + * Define pks_init_task and pks_sched_in as macros to avoid requiring the + * definition of struct task_struct in this header while keeping the supervisor + * pkey #ifdefery out of process.c and process_64.c + */ + +/* + * New tasks get the most restrictive PKRS value. + */ +#define pks_init_task(tsk) \ + tsk->thread.saved_pkrs = INIT_PKRS_VALUE + +/* + * PKRS is only temporarily changed during specific code paths. Only a + * preemption during these windows away from the default value would + * require updating the MSR. write_pkrs() handles this optimization. 
+ */ +#define pks_sched_in() \ + write_pkrs(current->thread.saved_pkrs) + +#else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */ + +#define pks_init_task(tsk) +#define pks_sched_in() + +#endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */ + #ifdef CONFIG_X86_32 #define INIT_THREAD { \ .sp0 = TOP_OF_INIT_STACK, \ @@ -784,7 +822,14 @@ static inline void spin_lock_prefetch(const void *x) #define KSTK_ESP(task) (task_pt_regs(task)->sp) #else -#define INIT_THREAD { } + +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS +#define INIT_THREAD { \ + .saved_pkrs = INIT_PKRS_VALUE, \ +} +#else +#define INIT_THREAD { } +#endif extern unsigned long KSTK_ESP(struct task_struct *task); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 9c214d7085a4..89f8454a8541 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -43,6 +43,7 @@ #include #include #include +#include #include "process.h" @@ -195,6 +196,8 @@ void flush_thread(void) memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array)); fpu__clear_all(&tsk->thread.fpu); + + pks_init_task(tsk); } void disable_TSC(void) diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index d08307df69ad..e590ecac1650 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -632,6 +632,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) /* Load the Intel cache allocation PQR MSR. */ resctrl_sched_in(); + pks_sched_in(); + return prev_p; } From patchwork Wed Mar 31 19:14:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 413173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3518DC43618 for ; Wed, 31 Mar 2021 19:15:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 109C061076 for ; Wed, 31 Mar 2021 19:15:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236084AbhCaTOq (ORCPT ); Wed, 31 Mar 2021 15:14:46 -0400 Received: from mga12.intel.com ([192.55.52.136]:51675 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235608AbhCaTOY (ORCPT ); Wed, 31 Mar 2021 15:14:24 -0400 IronPort-SDR: wOfKUgTLS/QtQbzNgN2bSuQzoS8FCafDQDBkvrS0JfxRPd6PfYgbuHdmebkmp/gEZ7bw5fBQC0 PG9uYCG1Q6DA== X-IronPort-AV: E=McAfee;i="6000,8403,9940"; a="171490634" X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="171490634" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:24 -0700 IronPort-SDR: PkI1AG59GIpKt5hoDQWdBM05aHuqMOQDLnEYfbwnBw0wdNhmgaumSLgctMDEqBrCQHyUL5NWqM 3Qra8MA1CvYA== X-IronPort-AV: E=Sophos;i="5.81,293,1610438400"; d="scan'208";a="418821851" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2021 12:14:23 -0700 From: ira.weiny@intel.com To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra Cc: 
Fenghua Yu , Sean Christopherson , Dan Williams , Ira Weiny , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH V5 09/10] x86/pks: Add PKS kernel API Date: Wed, 31 Mar 2021 12:14:04 -0700 Message-Id: <20210331191405.341999-10-ira.weiny@intel.com> X-Mailer: git-send-email 2.28.0.rc0.12.gb6a658bd00c9 In-Reply-To: <20210331191405.341999-1-ira.weiny@intel.com> References: <20210331191405.341999-1-ira.weiny@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org From: Fenghua Yu PKS allows kernel users to define domains of page mappings which have additional protections beyond the paging protections. Violating those protections results in an oops. Add an API to allocate, use, and free a protection key which identifies such a domain. Export 5 new symbols pks_key_alloc(), pks_mk_noaccess(), pks_mk_readonly(), pks_mk_readwrite(), and pks_key_free(). Add 2 new macros; PAGE_KERNEL_PKEY(key) and _PAGE_PKEY(pkey). Update the protection key documentation to cover pkeys on supervisor pages. Cc: Sean Christopherson Reviewed-by: Dan Williams Co-developed-by: Ira Weiny Signed-off-by: Ira Weiny Signed-off-by: Fenghua Yu --- Changes from V4: From Sean Christopherson Add what happens if the pkey is violated Changes from V3: From Dan Williams Remove flags from pks_key_alloc() Convert to ARCH_ENABLE_SUPERVISOR_PKEYS Update documentation for ARCH_ENABLE_SUPERVISOR_PKEYS No need to export write_pkrs Correct Kernel Doc for API functions From Dan Williams remove export of update_pkey_val() Update documentation change __clear_bit to clear_bit_unlock remove cpu_feature_enabled from pks_key_free remove pr_err stubs when CONFIG_HAS_SUPERVISOR_PKEYS=n clarify pks_key_alloc flags parameter with enum From Randy Dunlap: Fix grammatical errors in doc Changes from V2 From Greg KH Replace all WARN_ON_ONCE() uses with pr_err() From Dan Williams Add __must_check to pks_key_alloc() to help ensure users are using the API correctly Changes from V1 Per Dave Hansen Add flags to pks_key_alloc() to help future proof the interface if/when the key space is exhausted. Changes from RFC V3 Per Dave Hansen Put WARN_ON_ONCE in pks_key_free() s/pks_mknoaccess/pks_mk_noaccess/ s/pks_mkread/pks_mk_readonly/ s/pks_mkrdwr/pks_mk_readwrite/ Change return pks_key_alloc() to EOPNOTSUPP when not supported or configured Per Peter Zijlstra Remove unneeded preempt disable/enable --- Documentation/core-api/protection-keys.rst | 109 +++++++++++++--- arch/x86/include/asm/pgtable_types.h | 12 ++ arch/x86/include/asm/pks.h | 4 + arch/x86/mm/pkeys.c | 137 ++++++++++++++++++++- include/linux/pgtable.h | 4 + include/linux/pkeys.h | 17 +++ 6 files changed, 264 insertions(+), 19 deletions(-) diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst index ec575e72d0b2..5b1b90d8bdab 100644 --- a/Documentation/core-api/protection-keys.rst +++ b/Documentation/core-api/protection-keys.rst @@ -4,25 +4,30 @@ Memory Protection Keys ====================== -Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature -which is found on Intel's Skylake (and later) "Scalable Processor" -Server CPUs. It will be available in future non-server Intel parts -and future AMD processors. +Memory Protection Keys provide a mechanism for enforcing page-based +protections, but without requiring modification of the page tables +when an application changes protection domains. 
-For anyone wishing to test or use this feature, it is available in -Amazon's EC2 C5 instances and is known to work there using an Ubuntu -17.04 image. +PKeys Userspace (PKU) is a feature which is found on Intel's Skylake "Scalable +Processor" Server CPUs and later. And it will be available in future +non-server Intel parts and future AMD processors. -Memory Protection Keys provides a mechanism for enforcing page-based -protections, but without requiring modification of the page tables -when an application changes protection domains. It works by -dedicating 4 previously ignored bits in each page table entry to a -"protection key", giving 16 possible keys. +Protection Keys for Supervisor pages (PKS) is available in the SDM since May +2020. + +pkeys work by dedicating 4 previously Reserved bits in each page table entry to +a "protection key", giving 16 possible keys. User and Supervisor pages are +treated separately. -There is also a new user-accessible register (PKRU) with two separate -bits (Access Disable and Write Disable) for each key. Being a CPU -register, PKRU is inherently thread-local, potentially giving each -thread a different set of protections from every other thread. +Protections for each page are controlled with per-CPU registers for each type +of page User and Supervisor. Each of these 32-bit register stores two separate +bits (Access Disable and Write Disable) for each key. + +For Userspace the register is user-accessible (rdpkru/wrpkru). For +Supervisor, the register (MSR_IA32_PKRS) is accessible only to the kernel. + +Being a CPU register, pkeys are inherently thread-local, potentially giving +each thread an independent set of protections from every other thread. There are two new instructions (RDPKRU/WRPKRU) for reading and writing to the new register. The feature is only available in 64-bit mode, @@ -30,8 +35,11 @@ even though there is theoretically space in the PAE PTEs. These permissions are enforced on data access only and have no effect on instruction fetches. -Syscalls -======== +For kernel space rdmsr/wrmsr are used to access the kernel MSRs. + + +Syscalls for user space keys +============================ There are 3 system calls which directly interact with pkeys:: @@ -98,3 +106,68 @@ with a read():: The kernel will send a SIGSEGV in both cases, but si_code will be set to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when the plain mprotect() permissions are violated. + + +Kernel API for PKS support +========================== + +Similar to user space pkeys, supervisor pkeys allow additional protections to +be defined for a supervisor mappings. Unlike user space pkeys, Violations of +these protections result in a a kernel oops. + +The following interface is used to allocate, use, and free a pkey which defines +a 'protection domain' within the kernel. Setting a pkey value in a supervisor +PTE adds this additional protection to the page. + +Kernel users intending to use PKS support should check (depend on) +ARCH_HAS_SUPERVISOR_PKEYS and add their config to ARCH_ENABLE_SUPERVISOR_PKEYS +to turn on this support within the core. + + int pks_key_alloc(const char * const pkey_user); + #define PAGE_KERNEL_PKEY(pkey) + #define _PAGE_KEY(pkey) + void pks_mk_noaccess(int pkey); + void pks_mk_readonly(int pkey); + void pks_mk_readwrite(int pkey); + void pks_key_free(int pkey); + +pks_key_alloc() allocates keys dynamically to allow better use of the limited +key space. + +Callers of pks_key_alloc() _must_ be prepared for it to fail and take +appropriate action. 
This is due mainly to the fact that PKS may not be +available on all arch's. Failure to check the return of pks_key_alloc() and +using any of the rest of the API is undefined. + +Keys are allocated with 'No Access' permissions. If other permissions are +required before the pkey is used, the pks_mk*() family of calls, documented +below, can be used prior to setting the pkey within the page table entries. + +Kernel users must set the pkey in the page table entries for the mappings they +want to protect. This can be done with PAGE_KERNEL_PKEY() or _PAGE_KEY(). + +The pks_mk*() family of calls allows kernel users to change the protections for +the domain identified by the pkey parameter. 3 states are available: +pks_mk_noaccess(), pks_mk_readonly(), and pks_mk_readwrite() which set the +access to none, read, and read/write respectively. + +Finally, pks_key_free() allows a user to return the key to the allocator for +use by others. + +The interface maintains pks_mk_noaccess() (Access Disabled (AD=1)) for all keys +not currently allocated. Therefore, the user can depend on access being +disabled when pks_key_alloc() returns a key and the user should remove mappings +from the domain (remove the pkey from the PTE) prior to calling pks_key_free(). + +It should be noted that the underlying WRMSR(MSR_IA32_PKRS) is not serializing +but still maintains ordering properties similar to WRPKRU. Thus it is safe to +immediately use a mapping when the pks_mk*() functions return. + +Older versions of the SDM on PKRS may be wrong with regard to this +serialization. The text should be the same as that of WRPKRU. From the WRPKRU +text: + + WRPKRU will never execute transiently. Memory accesses + affected by PKRU register will not execute (even transiently) + until all prior executions of WRPKRU have completed execution + and updated the PKRU register. diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index f24d7ef8fffa..a3cb274351d9 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -73,6 +73,12 @@ _PAGE_PKEY_BIT2 | \ _PAGE_PKEY_BIT3) +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS +#define _PAGE_PKEY(pkey) (_AT(pteval_t, pkey) << _PAGE_BIT_PKEY_BIT0) +#else +#define _PAGE_PKEY(pkey) (_AT(pteval_t, 0)) +#endif + #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE) #define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY | _PAGE_ACCESSED) #else @@ -228,6 +234,12 @@ enum page_cache_mode { #define PAGE_KERNEL_IO __pgprot_mask(__PAGE_KERNEL_IO) #define PAGE_KERNEL_IO_NOCACHE __pgprot_mask(__PAGE_KERNEL_IO_NOCACHE) +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS +#define PAGE_KERNEL_PKEY(pkey) __pgprot_mask(__PAGE_KERNEL | _PAGE_PKEY(pkey)) +#else +#define PAGE_KERNEL_PKEY(pkey) PAGE_KERNEL +#endif + #endif /* __ASSEMBLY__ */ /* xwr */ diff --git a/arch/x86/include/asm/pks.h b/arch/x86/include/asm/pks.h index 516d82f032b6..2fff4ac484b9 100644 --- a/arch/x86/include/asm/pks.h +++ b/arch/x86/include/asm/pks.h @@ -4,6 +4,10 @@ #ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS +/* PKS supports 16 keys. Key 0 is reserved for the kernel. */ +#define PKS_KERN_DEFAULT_KEY 0 +#define PKS_NUM_KEYS 16 + struct extended_pt_regs { u32 thread_pkrs; /* Keep stack 8 byte aligned */ diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c index f6a3a54b8d7d..47d29707ac39 100644 --- a/arch/x86/mm/pkeys.c +++ b/arch/x86/mm/pkeys.c @@ -3,6 +3,9 @@ * Intel Memory Protection Keys management * Copyright (c) 2015, Intel Corporation. 
*/ +#undef pr_fmt +#define pr_fmt(fmt) "x86/pkeys: " fmt + #include /* debugfs_create_u32() */ #include /* mm_struct, vma, etc... */ #include /* PKEY_* */ @@ -11,6 +14,7 @@ #include /* boot_cpu_has, ... */ #include /* vma_pkey() */ #include /* init_fpstate */ +#include int __execute_only_pkey(struct mm_struct *mm) { @@ -276,4 +280,135 @@ void setup_pks(void) cr4_set_bits(X86_CR4_PKS); } -#endif +/* + * Do not call this directly, see pks_mk*() below. + * + * @pkey: Key for the domain to change + * @protection: protection bits to be used + * + * Protection utilizes the same protection bits specified for User pkeys + * PKEY_DISABLE_ACCESS + * PKEY_DISABLE_WRITE + * + */ +static inline void pks_update_protection(int pkey, unsigned long protection) +{ + current->thread.saved_pkrs = update_pkey_val(current->thread.saved_pkrs, + pkey, protection); + write_pkrs(current->thread.saved_pkrs); +} + +/** + * pks_mk_noaccess() - Disable all access to the domain + * @pkey the pkey for which the access should change. + * + * Disable all access to the domain specified by pkey. This is a global + * update and only affects the current running thread. + * + * It is a bug for users to call this without a valid pkey returned from + * pks_key_alloc() + */ +void pks_mk_noaccess(int pkey) +{ + pks_update_protection(pkey, PKEY_DISABLE_ACCESS); +} +EXPORT_SYMBOL_GPL(pks_mk_noaccess); + +/** + * pks_mk_readonly() - Make the domain Read only + * @pkey the pkey for which the access should change. + * + * Allow read access to the domain specified by pkey. This is a global update + * and only affects the current running thread. + * + * It is a bug for users to call this without a valid pkey returned from + * pks_key_alloc() + */ +void pks_mk_readonly(int pkey) +{ + pks_update_protection(pkey, PKEY_DISABLE_WRITE); +} +EXPORT_SYMBOL_GPL(pks_mk_readonly); + +/** + * pks_mk_readwrite() - Make the domain Read/Write + * @pkey the pkey for which the access should change. + * + * Allow all access, read and write, to the domain specified by pkey. This is + * a global update and only affects the current running thread. + * + * It is a bug for users to call this without a valid pkey returned from + * pks_key_alloc() + */ +void pks_mk_readwrite(int pkey) +{ + pks_update_protection(pkey, 0); +} +EXPORT_SYMBOL_GPL(pks_mk_readwrite); + +static const char pks_key_user0[] = "kernel"; + +/* Store names of allocated keys for debug. Key 0 is reserved for the kernel. */ +static const char *pks_key_users[PKS_NUM_KEYS] = { + pks_key_user0 +}; + +/* + * Each key is represented by a bit. Bit 0 is set for key 0 and reserved for + * its use. We use ulong for the bit operations but only 16 bits are used. + */ +static unsigned long pks_key_allocation_map = 1 << PKS_KERN_DEFAULT_KEY; + +/** + * pks_key_alloc() - Allocate a PKS key + * @pkey_user: String stored for debugging of key exhaustion. The caller is + * responsible to maintain this memory until pks_key_free(). 
+ * + * Return: pkey if success + * -EOPNOTSUPP if pks is not supported or not enabled + * -ENOSPC if no keys are available + */ +__must_check int pks_key_alloc(const char * const pkey_user) +{ + int nr; + + if (!cpu_feature_enabled(X86_FEATURE_PKS)) + return -EOPNOTSUPP; + + while (1) { + nr = find_first_zero_bit(&pks_key_allocation_map, PKS_NUM_KEYS); + if (nr >= PKS_NUM_KEYS) { + pr_info("Cannot allocate supervisor key for %s.\n", + pkey_user); + return -ENOSPC; + } + if (!test_and_set_bit_lock(nr, &pks_key_allocation_map)) + break; + } + + /* for debugging key exhaustion */ + pks_key_users[nr] = pkey_user; + + return nr; +} +EXPORT_SYMBOL_GPL(pks_key_alloc); + +/** + * pks_key_free() - Free a previously allocate PKS key + * @pkey: Key to be free'ed + */ +void pks_key_free(int pkey) +{ + if (pkey >= PKS_NUM_KEYS || pkey <= PKS_KERN_DEFAULT_KEY) { + pr_err("Invalid PKey value: %d\n", pkey); + return; + } + + /* Restore to default of no access */ + pks_mk_noaccess(pkey); + pks_key_users[pkey] = NULL; + clear_bit_unlock(pkey, &pks_key_allocation_map); +} +EXPORT_SYMBOL_GPL(pks_key_free); + +#endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */ diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 5e772392a379..e189e9ab6904 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1464,6 +1464,10 @@ static inline bool arch_has_pfn_modify_check(void) # define PAGE_KERNEL_EXEC PAGE_KERNEL #endif +#ifndef PAGE_KERNEL_PKEY +#define PAGE_KERNEL_PKEY(pkey) PAGE_KERNEL +#endif + /* * Page Table Modification bits for pgtbl_mod_mask. * diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h index a3d17a8e4e81..6659404af876 100644 --- a/include/linux/pkeys.h +++ b/include/linux/pkeys.h @@ -56,6 +56,13 @@ static inline void copy_init_pkru_to_fpregs(void) void pkrs_save_set_irq(struct pt_regs *regs, u32 val); void pkrs_restore_irq(struct pt_regs *regs); +__must_check int pks_key_alloc(const char *const pkey_user); +void pks_key_free(int pkey); + +void pks_mk_noaccess(int pkey); +void pks_mk_readonly(int pkey); +void pks_mk_readwrite(int pkey); + #else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */ #ifndef INIT_PKRS_VALUE @@ -65,6 +72,16 @@ void pkrs_restore_irq(struct pt_regs *regs); static inline void pkrs_save_set_irq(struct pt_regs *regs, u32 val) { } static inline void pkrs_restore_irq(struct pt_regs *regs) { } +static inline __must_check int pks_key_alloc(const char * const pkey_user) +{ + return -EOPNOTSUPP; +} + +static inline void pks_key_free(int pkey) {} +static inline void pks_mk_noaccess(int pkey) {} +static inline void pks_mk_readonly(int pkey) {} +static inline void pks_mk_readwrite(int pkey) {} + #endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */ #endif /* _LINUX_PKEYS_H */
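To tie the series together, below is a hypothetical consumer sketch of the API documented above. It assumes a kernel carrying this series with ARCH_ENABLE_SUPERVISOR_PKEYS selected by the consumer's Kconfig; the function name my_pks_demo() is illustrative and not part of the patches.

	/*
	 * Hypothetical consumer sketch of the documented PKS API.  Requires a
	 * kernel with this series applied and ARCH_ENABLE_SUPERVISOR_PKEYS
	 * selected; my_pks_demo() is an illustrative name only.
	 */
	#include <linux/mm.h>
	#include <linux/vmalloc.h>
	#include <linux/pgtable.h>
	#include <linux/string.h>
	#include <linux/pkeys.h>

	static int my_pks_demo(void)
	{
		struct page *page;
		void *vaddr;
		int pkey;

		pkey = pks_key_alloc("my_pks_demo");
		if (pkey < 0)			/* -EOPNOTSUPP or -ENOSPC */
			return pkey;

		page = alloc_page(GFP_KERNEL);
		if (!page) {
			pks_key_free(pkey);
			return -ENOMEM;
		}

		/* Tag the supervisor mapping with the new key. */
		vaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL_PKEY(pkey));
		if (!vaddr) {
			__free_page(page);
			pks_key_free(pkey);
			return -ENOMEM;
		}

		/* Keys start as no-access; open the domain only while touching it. */
		pks_mk_readwrite(pkey);
		memset(vaddr, 0, PAGE_SIZE);
		pks_mk_readonly(pkey);
		/* ... reads are fine here, a stray write would oops ... */
		pks_mk_noaccess(pkey);

		/* Remove the mapping before returning the key to the allocator. */
		vunmap(vaddr);
		__free_page(page);
		pks_key_free(pkey);
		return 0;
	}

The ordering mirrors the documentation: the key is allocated in the no-access state, the mapping is tagged via PAGE_KERNEL_PKEY(), permissions are opened only around the access, and the mapping is removed before pks_key_free() returns the key for reuse.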