diff mbox series

[v3,2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle

Message ID 20220325022219.829-3-chang.seok.bae@intel.com
State Superseded
Headers show
Series x86/fpu: Make AMX state ready for CPU idle | expand

Commit Message

Chang S. Bae March 25, 2022, 2:22 a.m. UTC
When a CPU enters an idle state, non-initialized states left in large
registers may be the cause of preventing deeper low-power states.

The new helper ensures the AMX state is initialized to make the CPU
ready for low-power states. It will be used by the intel idle driver.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v2:
* Check the feature flag instead of fpu_state_size_dynamic() (Dave Hansen).

Changes from v1:
* Check the dynamic state flag first, to avoid #UD with XGETBV(1).
---
 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 13 +++++++++++++
 3 files changed, 24 insertions(+)

Comments

Thomas Gleixner April 3, 2022, 4:37 p.m. UTC | #1
On Thu, Mar 24 2022 at 19:22, Chang S. Bae wrote:
> When a CPU enters an idle state, non-initialized states left in large
> registers may be the cause of preventing deeper low-power states.
>
> The new helper ensures the AMX state is initialized to make the CPU
> ready for low-power states. It will be used by the intel idle driver.

What about AVX...AVX512? Are they harmless in that regard?

If so, then the first sentence above is confusing and should clearly
spell out that it's AMX which causes that problem.

In not, then why are we not putting them into init too?

Thanks,

        tglx
Chang S. Bae April 5, 2022, 12:24 a.m. UTC | #2
On 4/3/2022 9:37 AM, Thomas Gleixner wrote:
> On Thu, Mar 24 2022 at 19:22, Chang S. Bae wrote:
>> When a CPU enters an idle state, non-initialized states left in large
>> registers may be the cause of preventing deeper low-power states.
>>
>> The new helper ensures the AMX state is initialized to make the CPU
>> ready for low-power states. It will be used by the intel idle driver.
> 
> What about AVX...AVX512? Are they harmless in that regard?

I double-checked with HW folks. Yes, that's the case. Their state 
whether initialized or not does not prevent the processor from entering 
a deeper low-power state.

> 
> If so, then the first sentence above is confusing and should clearly
> spell out that it's AMX which causes that problem.

I see. Will do that.

Thanks,
Chang
diff mbox series

Patch

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index c83b3020350a..df48912fd1c8 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -165,4 +165,6 @@  static inline bool fpstate_is_confidential(struct fpu_guest *gfpu)
 struct task_struct;
 extern long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2);
 
+extern void fpu_idle_fpregs(void);
+
 #endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 68c257a3de0d..d434fbaeb3ff 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -294,6 +294,15 @@  static inline int enqcmds(void __iomem *dst, const void *src)
 	return 0;
 }
 
+static inline void tile_release(void)
+{
+	/*
+	 * Instruction opcode for TILERELEASE; supported in binutils
+	 * version >= 2.36.
+	 */
+	asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0");
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_SPECIAL_INSNS_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8dea01ffc5c1..3507609e22d7 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -847,3 +847,16 @@  int fpu__exception_code(struct fpu *fpu, int trap_nr)
 	 */
 	return 0;
 }
+
+/*
+ * Initialize register state that may prevent from entering low-power idle.
+ * This function will be invoked from the cpuidle driver only when needed.
+ */
+void fpu_idle_fpregs(void)
+{
+	if (cpu_feature_enabled(X86_FEATURE_XGETBV1) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}