mbox series

[v4,0/2] x86/fpu: Make AMX state ready for CPU idle

Message ID 20220517222430.24524-1-chang.seok.bae@intel.com
Headers show
Series x86/fpu: Make AMX state ready for CPU idle | expand

Message

Chang S. Bae May 17, 2022, 10:24 p.m. UTC
Hi,

First of all, my apologies for the long delay with my vacation.

Changes from v3 [1]:
* Call out AMX state only needs to be initialized for CPU idle (Thomas).
* Drop the XCR0 accessor changes as not relevant (Dave).

The series is available here:
  git://github.com/intel/amx-linux.git tilerelease

=== Original Cover Letter ===

AMX state is a large state (at least 8KB or more). Entering CPU idle with
this non-initialized large state may result in shallow states while a
deeper low-power state is available.

We can confirm this behavior is implementation-specific. Section 3.3 in [2]
will be updated to clarify this.

Thanks,
Chang

[1]: https://lore.kernel.org/lkml/20220325022219.829-1-chang.seok.bae@intel.com/
[2]: Intel Architecture Instruction Set Extension Programming Reference
     May 2021, https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf

Chang S. Bae (2):
  x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  intel_idle: Add a new flag to initialize the AMX state

 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 13 +++++++++++++
 drivers/idle/intel_idle.c            | 18 ++++++++++++++++--
 4 files changed, 40 insertions(+), 2 deletions(-)


base-commit: 42226c989789d8da4af1de0c31070c96726d990c

Comments

Dave Hansen May 18, 2022, 3:41 p.m. UTC | #1
On 5/17/22 15:24, Chang S. Bae wrote:
> +/*
> + * Initialize register state that may prevent from entering low-power idle.
> + * This function will be invoked from the cpuidle driver only when needed.
> + */
> +void fpu_idle_fpregs(void)
> +{
> +	if (cpu_feature_enabled(X86_FEATURE_XGETBV1) &&
> +	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
> +		tile_release();
> +		fpregs_deactivate(&current->thread.fpu);
> +	}
> +}

This is a pretty minor nit, but:

X86_FEATURE_XFD depends on X86_FEATURE_XGETBV1

and

X86_FEATURE_AMX_TILE depends on X86_FEATURE_XFD

via cpu_deps[].  So there is an implicit dependency all the way from AMX
to XGETBV1.  It's also not patently obvious what X86_FEATURE_XGETBV1 has
to do with the rest of the if().

Would this make more logical sense to folks?

	/* Note: AMX_TILE being enabled implies XGETBV1 support */
	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
		tile_release();
		fpregs_deactivate(&current->thread.fpu);
	}

That also has a nice side effect that non-AMX systems will get to use a
static branch and can also skip over the XGETBV1 entirely.

The downside is that there's no explicit XGETBV1 check before calling
xfeatures_in_use().  But, I don't really expect the AMX->XGETBV1
dependency to go away either.
Chang S. Bae May 18, 2022, 5:20 p.m. UTC | #2
On 5/18/2022 8:41 AM, Dave Hansen wrote:
> 
> This is a pretty minor nit, but:
> 
> X86_FEATURE_XFD depends on X86_FEATURE_XGETBV1
> 
> and
> 
> X86_FEATURE_AMX_TILE depends on X86_FEATURE_XFD
> 
> via cpu_deps[].  So there is an implicit dependency all the way from AMX
> to XGETBV1.  It's also not patently obvious what X86_FEATURE_XGETBV1 has
> to do with the rest of the if().
> 
> Would this make more logical sense to folks?
> 
> 	/* Note: AMX_TILE being enabled implies XGETBV1 support */
> 	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
> 	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
> 		tile_release();
> 		fpregs_deactivate(&current->thread.fpu);
> 	}

With the note, I guess people will have no problem with AMX->XGETBV1. 
But I would leave this question to others who can tell.

> 
> That also has a nice side effect that non-AMX systems will get to use a
> static branch and can also skip over the XGETBV1 entirely.

Yes, but FWIW, as it is non-architectural, the function should be 
consumed only by drivers for AMX systems.

> 
> The downside is that there's no explicit XGETBV1 check before calling
> xfeatures_in_use().  But, I don't really expect the AMX->XGETBV1
> dependency to go away either.

Yes, as long as AMX is wanted as a dynamic feature I think.

Thanks,
Chang
Dave Hansen May 18, 2022, 5:27 p.m. UTC | #3
On 5/18/22 10:20, Chang S. Bae wrote:
>> That also has a nice side effect that non-AMX systems will get to use a
>> static branch and can also skip over the XGETBV1 entirely.
> 
> Yes, but FWIW, as it is non-architectural, the function should be
> consumed only by drivers for AMX systems.

Oh, that's a good point.  The new flag is set only on Sapphire Rapids
systems so the new code is only called there as well.  I guess it would
only matter if we end up having systems that need this where AMX isn't
universally available, like if it were fused on or off on certain SKUs
(I have no idea if anyone is actually doing this).