[v2,00/38] x86/cpu: Rework the topology evaluation

Message ID 20230728105650.565799744@linutronix.de

Message

Thomas Gleixner July 28, 2023, 12:12 p.m. UTC
Hi!

This is the follow up to V1:

  https://lore.kernel.org/lkml/20230724155329.474037902@linutronix.de

which addresses the review feedback and some minor fallout I observed in my
testing of the work based on top.

TLDR:

This reworks how topology information is evaluated via CPUID
in preparation for a larger topology management overhaul to address
shortcomings of the current code vs. hybrid systems and systems which make
use of the extended topology domains in leaf 0x1f. Aside from that, it's an
overdue spring cleaning to get rid of accumulated layers of duct tape and
haywire.

What changed vs. V1:

  - Fixed an issue vs. the logical die/pkg management as the current
    code (ab)uses cpuinfo for persistent storage.

  - Consolidated APIC ID usage on u32 and ditched the u16 limitation

  - Addressed the review feedback from Peter and Arjan

  - Added a new patch which gets rid of XENPV fiddling in the cpuinfo
    state. That needs some testing on XENPV obviously. The relevant
    patches are #22 and #37

I did not pick up any of the Tested-by tags yet. I hope people can run it
once more. Neither did I add the Ack from Peter.

The series is based on the APIC cleanup series:

  https://lore.kernel.org/lkml/20230724131206.500814398@linutronix.de

and also available on top of that from git:

 git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2

Thanks,

	tglx
---
 Documentation/arch/x86/topology.rst       |   12 -
 a/arch/x86/kernel/cpu/topology.c          |  168 ---------------------
 arch/x86/events/amd/core.c                |    2 
 arch/x86/events/amd/uncore.c              |    2 
 arch/x86/events/intel/uncore.c            |    2 
 arch/x86/hyperv/hv_vtl.c                  |    2 
 arch/x86/include/asm/apic.h               |   32 +---
 arch/x86/include/asm/cacheinfo.h          |    3 
 arch/x86/include/asm/cpuid.h              |   32 ++++
 arch/x86/include/asm/mpspec.h             |    2 
 arch/x86/include/asm/processor.h          |   60 +++++--
 arch/x86/include/asm/smp.h                |    4 
 arch/x86/include/asm/topology.h           |   51 +++++-
 arch/x86/include/asm/x86_init.h           |    2 
 arch/x86/kernel/acpi/boot.c               |    4 
 arch/x86/kernel/amd_nb.c                  |    8 -
 arch/x86/kernel/apic/apic.c               |   14 -
 arch/x86/kernel/apic/apic_common.c        |    4 
 arch/x86/kernel/apic/apic_flat_64.c       |   13 -
 arch/x86/kernel/apic/apic_noop.c          |    9 -
 arch/x86/kernel/apic/apic_numachip.c      |   21 --
 arch/x86/kernel/apic/bigsmp_32.c          |   10 -
 arch/x86/kernel/apic/local.h              |    6 
 arch/x86/kernel/apic/probe_32.c           |   10 -
 arch/x86/kernel/apic/x2apic_cluster.c     |    1 
 arch/x86/kernel/apic/x2apic_phys.c        |   10 -
 arch/x86/kernel/apic/x2apic_uv_x.c        |   67 +-------
 arch/x86/kernel/cpu/Makefile              |    5 
 arch/x86/kernel/cpu/amd.c                 |  156 --------------------
 arch/x86/kernel/cpu/cacheinfo.c           |   51 ++----
 arch/x86/kernel/cpu/centaur.c             |    4 
 arch/x86/kernel/cpu/common.c              |  111 +-------------
 arch/x86/kernel/cpu/cpu.h                 |   14 +
 arch/x86/kernel/cpu/hygon.c               |  133 -----------------
 arch/x86/kernel/cpu/intel.c               |   38 ----
 arch/x86/kernel/cpu/mce/amd.c             |    4 
 arch/x86/kernel/cpu/mce/apei.c            |    4 
 arch/x86/kernel/cpu/mce/core.c            |    4 
 arch/x86/kernel/cpu/mce/inject.c          |    7 
 arch/x86/kernel/cpu/proc.c                |    8 -
 arch/x86/kernel/cpu/zhaoxin.c             |   18 --
 arch/x86/kernel/kvm.c                     |    6 
 arch/x86/kernel/sev.c                     |    2 
 arch/x86/kernel/smpboot.c                 |   97 +++++++-----
 arch/x86/kernel/vsmp_64.c                 |   13 -
 arch/x86/mm/amdtopology.c                 |   35 ++--
 arch/x86/mm/numa.c                        |    4 
 arch/x86/xen/apic.c                       |   14 -
 arch/x86/xen/smp_pv.c                     |    3 
 b/arch/x86/kernel/cpu/debugfs.c           |   97 ++++++++++++
 b/arch/x86/kernel/cpu/topology.h          |   51 ++++++
 b/arch/x86/kernel/cpu/topology_amd.c      |  179 +++++++++++++++++++++++
 b/arch/x86/kernel/cpu/topology_common.c   |  233 ++++++++++++++++++++++++++++++
 b/arch/x86/kernel/cpu/topology_ext.c      |  136 +++++++++++++++++
 drivers/edac/amd64_edac.c                 |    4 
 drivers/edac/mce_amd.c                    |    4 
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |    2 
 drivers/hwmon/fam15h_power.c              |    7 
 drivers/scsi/lpfc/lpfc_init.c             |    8 -
 drivers/virt/acrn/hsm.c                   |    2 
 60 files changed, 1049 insertions(+), 956 deletions(-)

Comments

Sohil Mehta July 28, 2023, 11:57 p.m. UTC | #1
> +++ b/arch/x86/kernel/cpu/debugfs.c
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/debugfs.h>
> +
> +#include <asm/apic.h>
> +#include <asm/processor.h>
> +

I ran checkpatch through the patches and it lists a bunch of errors and
warnings. Most of them are due to issues in the existing code which is
probably not worth fixing.

However, I found a couple of recommendations which might be of interest.

1) Using linux headers instead of the asm ones. Namely,

> Consider using #include <linux/processor.h> instead of <asm/processor.h>
> Consider using #include <linux/topology.h> instead of <asm/topology.h>
> Consider using #include <linux/smp.h> instead of <asm/smp.h>

2) Macros with multiple statements should be enclosed in a do - while loop
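
For reference, a minimal sketch of the pattern the second recommendation
asks for. The struct, the macro name and the helper are hypothetical and
not taken from the series; they only illustrate why the wrapper matters
after an unbraced if/else.

```c
#include <assert.h>

struct topo_ids {		/* hypothetical, for illustration only */
	unsigned int llc_id;
	unsigned int l2c_id;
};

/*
 * Wrapping both statements in do { } while (0) makes the macro expand to
 * a single statement, so it stays correct after an unbraced if and the
 * trailing semicolon does not break a following else.
 */
#define RESET_CACHE_IDS(t, val)			\
	do {					\
		(t)->llc_id = (val);		\
		(t)->l2c_id = (val);		\
	} while (0)

/* Reset both IDs when @reset is nonzero, otherwise clear only llc_id. */
static void maybe_reset(struct topo_ids *t, int reset)
{
	assert(t);

	if (reset)
		RESET_CACHE_IDS(t, 0xffffU);
	else
		t->llc_id = 0;
}
```

Without the wrapper, only the first assignment would be guarded by the
if, and the else would no longer compile.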
Sohil Mehta July 29, 2023, 12:02 a.m. UTC | #2
On 7/28/2023 5:13 AM, Thomas Gleixner wrote:
> Topology evaluation is a complete disaster and impenetrable mess. It's
> scattered all over the place with some vendor implementatins doing early

s/implementatins/implementations

> +static void parse_topology(struct topo_scan *tscan, bool early)
> +{
> +	const struct cpuinfo_topology topo_defaults = {
> +		.cu_id			= 0xff,
> +		.llc_id			= BAD_APICID,
> +		.l2c_id			= BAD_APICID,
> +	};
> +	struct cpuinfo_x86 *c = tscan->c;
> +	struct {
> +		u32	unused0		: 16,
> +			nproc		:  8,
> +			apicid		:  8;
> +	} ebx;
> +
> +	c->topo = topo_defaults;
> +
> +	if (fake_topology(tscan))
> +	    return;
> +

Spaces used for indenting "return" instead of a tab.
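
As a side note on the ebx bitfield in the quoted hunk: the two named
fields line up with CPUID leaf 1 EBX, where bits 23:16 hold the maximum
number of addressable logical processor IDs and bits 31:24 the initial
APIC ID. A minimal userspace sketch of the same decode using shifts
instead of a bitfield; the struct and function names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

struct leaf1_ebx {		/* hypothetical helper type */
	uint8_t nproc;		/* EBX[23:16] */
	uint8_t apicid;		/* EBX[31:24] */
};

/* Extract the two fields the quoted bitfield names from a raw EBX value. */
static struct leaf1_ebx decode_leaf1_ebx(uint32_t ebx)
{
	struct leaf1_ebx r = {
		.nproc	= (uint8_t)((ebx >> 16) & 0xff),
		.apicid	= (uint8_t)((ebx >> 24) & 0xff),
	};

	return r;
}

/* Sanity check against the value 0x00400800 used as sample input. */
static void leaf1_ebx_selftest(void)
{
	struct leaf1_ebx r = decode_leaf1_ebx(0x00400800);

	assert(r.nproc == 0x40);
	assert(r.apicid == 0);
}
```

The sample value 0x00400800 is the leaf 1 EBX from the boot CPU dump
posted later in this thread, which decodes to a count of 0x40 and an
initial APIC ID of 0.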
Sohil Mehta July 29, 2023, 12:07 a.m. UTC | #3
s/acessor/accessor
Arjan van de Ven July 31, 2023, 1:47 p.m. UTC | #4
On 7/30/2023 9:05 PM, Michael Kelley (LINUX) wrote:
> Does anyone have suggestions on a different way to handle
> this that's better than the above diff?  Other thoughts?

how badly do you need xapic ? Meaning, can x2apic just be used instead always
Andrew Cooper July 31, 2023, 2:08 p.m. UTC | #5
On 31/07/2023 2:47 pm, Arjan van de Ven wrote:
> On 7/30/2023 9:05 PM, Michael Kelley (LINUX) wrote:
>> Does anyone have suggestions on a different way to handle
>> this that's better than the above diff?  Other thoughts?
>
> how badly do you need xapic ? Meaning, can x2apic just be used instead
> always

x2APIC under virt is a problem if you don't want to fully emulate an
IOMMU just for int-remapping purposes.

You don't know a-priori whether a particular guest kernel knows about
e.g. the rsvd bit trick in IO-APIC RTEs to allow a 32k destination id.

The only generally compatible way is to start in xAPIC mode, leave all
the enumeration hints around which say "really really please switch into
x2APIC mode", and hope the kernel does.

~Andrew
Thomas Gleixner July 31, 2023, 3:38 p.m. UTC | #6
On Mon, Jul 31 2023 at 15:27, Peter Zijlstra wrote:
> On Mon, Jul 31, 2023 at 02:34:39PM +0200, Thomas Gleixner wrote:
>> This collides massively with the other work I'm doing, which uses the
>> MADT provided information to actually evaluate various topology related
>> things upfront and later during bringup. That's badly needed because lots
>> of today's infrastructure is based on heuristics and guesswork.
>> 
>> But it seems I wasted a month on reworking all of this just to be
>> stopped cold in the tracks by completely undocumented and unnecessary
>> hyper-v abuse.
>> 
>> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
>> for topology information related to L3, then hyper-v should override the
>> cache topology mechanism and not impose this insanity on the basic
>> topology evaluation infrastructure.
>
> So I'm very tempted to suggest you continue with the topology rewrite
> and let Hyper-V keep the pieces. They're very clearly violating the SDM.
>
> Things as they stand are untenable, the whole topology thing as it exists
> today is an untenable shitshow.
>
> Michael, is there anything you can do early (as in MADT parse early) to
> fix up the APIC-IDs?

I don't think so.

Michael, can you please provide me a table of:

   APICID (real/MADT)		APICID (CPUID)

from one of the tinker VMs please?

Thanks,

        tglx
Michael Kelley July 31, 2023, 4:10 p.m. UTC | #7
From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> 
> On Mon, Jul 31 2023 at 15:27, Peter Zijlstra wrote:
> > On Mon, Jul 31, 2023 at 02:34:39PM +0200, Thomas Gleixner wrote:
> >> This collides massively with the other work I'm doing, which uses the
> >> MADT provided information to actually evaluate various topology related
> >> things upfront and later during bringup. That's badly needed because lots
> >> of today's infrastructure is based on heuristics and guesswork.
> >>
> >> But it seems I wasted a month on reworking all of this just to be
> >> stopped cold in the tracks by completely undocumented and unnecessary
> >> hyper-v abuse.
> >>
> >> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
> >> for topology information related to L3, then hyper-v should override the
> >> cache topology mechanism and not impose this insanity on the basic
> >> topology evaluation infrastructure.
> >
> > So I'm very tempted to suggest you continue with the topology rewrite
> > and let Hyper-V keep the pieces. They're very clearly violating the SDM.
> >
> > Things as they stand are untenable, the whole topology thing as it exists
> > today is an untenable shitshow.
> >
> > Michael, is there anything you can do early (as in MADT parse early) to
> > fix up the APIC-IDs?
> 
> I don't think so.
> 
> Michael, can you please provide me a table of:
> 
>    APICID (real/MADT)		APICID (CPUID)
> 
> from one of the tinker VMs please?
> 

The VM is an F72s_v2 in Azure running your patch set.  The VM has
72 vCPUs in two NUMA nodes across two physical Intel processors, with
36 vCPUs in each NUMA node.

The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
is from CPUID, while the apicid is from read_apic_id() and matches the
MADT.  As expected, the two values match for the first 36 vCPUs, but differ
by 28 (decimal) for the remaining 36.

initial_apicid:      0 apicid:              0
initial_apicid:      1 apicid:              1
initial_apicid:      2 apicid:              2
initial_apicid:      3 apicid:              3
initial_apicid:      4 apicid:              4
initial_apicid:      5 apicid:              5
initial_apicid:      6 apicid:              6
initial_apicid:      7 apicid:              7
initial_apicid:      8 apicid:              8
initial_apicid:      9 apicid:              9
initial_apicid:      a apicid:              a
initial_apicid:      b apicid:              b
initial_apicid:      c apicid:              c
initial_apicid:      d apicid:              d
initial_apicid:      e apicid:              e
initial_apicid:      f apicid:              f
initial_apicid:      10 apicid:              10
initial_apicid:      11 apicid:              11
initial_apicid:      12 apicid:              12
initial_apicid:      13 apicid:              13
initial_apicid:      14 apicid:              14
initial_apicid:      15 apicid:              15
initial_apicid:      16 apicid:              16
initial_apicid:      17 apicid:              17
initial_apicid:      18 apicid:              18
initial_apicid:      19 apicid:              19
initial_apicid:      1a apicid:              1a
initial_apicid:      1b apicid:              1b
initial_apicid:      1c apicid:              1c
initial_apicid:      1d apicid:              1d
initial_apicid:      1e apicid:              1e
initial_apicid:      1f apicid:              1f
initial_apicid:      20 apicid:              20
initial_apicid:      21 apicid:              21
initial_apicid:      22 apicid:              22
initial_apicid:      23 apicid:              23
initial_apicid:      40 apicid:              24
initial_apicid:      41 apicid:              25
initial_apicid:      42 apicid:              26
initial_apicid:      43 apicid:              27
initial_apicid:      44 apicid:              28
initial_apicid:      45 apicid:              29
initial_apicid:      46 apicid:              2a
initial_apicid:      47 apicid:              2b
initial_apicid:      48 apicid:              2c
initial_apicid:      49 apicid:              2d
initial_apicid:      4a apicid:              2e
initial_apicid:      4b apicid:              2f
initial_apicid:      4c apicid:              30
initial_apicid:      4d apicid:              31
initial_apicid:      4e apicid:              32
initial_apicid:      4f apicid:              33
initial_apicid:      50 apicid:              34
initial_apicid:      51 apicid:              35
initial_apicid:      52 apicid:              36
initial_apicid:      53 apicid:              37
initial_apicid:      54 apicid:              38
initial_apicid:      55 apicid:              39
initial_apicid:      56 apicid:              3a
initial_apicid:      57 apicid:              3b
initial_apicid:      58 apicid:              3c
initial_apicid:      59 apicid:              3d
initial_apicid:      5a apicid:              3e
initial_apicid:      5b apicid:              3f
initial_apicid:      5c apicid:              40
initial_apicid:      5d apicid:              41
initial_apicid:      5e apicid:              42
initial_apicid:      5f apicid:              43
initial_apicid:      60 apicid:              44
initial_apicid:      61 apicid:              45
initial_apicid:      62 apicid:              46
initial_apicid:      63 apicid:              47
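
The pattern in the table can be written down as a small sketch. It
assumes the rows are listed in vCPU order, so that the MADT APIC ID
equals the vCPU index; the function and macro names are made up for
illustration and describe only what this one VM shows.

```c
#include <assert.h>
#include <stdint.h>

#define VCPUS_PER_NODE		36U	/* from the VM description above */
#define SECOND_NODE_BASE	0x40U	/* first CPUID APIC ID on node 1 */

/* MADT / read_apic_id() value: dense, same as the vCPU index here. */
static uint32_t madt_apicid(uint32_t vcpu)
{
	return vcpu;
}

/* Initial APIC ID as reported by CPUID in the table above. */
static uint32_t cpuid_apicid(uint32_t vcpu)
{
	assert(vcpu < 2 * VCPUS_PER_NODE);

	if (vcpu < VCPUS_PER_NODE)
		return vcpu;

	/* The second node restarts at 0x40 instead of continuing at 0x24. */
	return SECOND_NODE_BASE + (vcpu - VCPUS_PER_NODE);
}
```

For vCPU 36 this gives 0x40 from CPUID versus 0x24 from the MADT, which
is the difference of 28 mentioned above.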

Michael
Thomas Gleixner July 31, 2023, 8:48 p.m. UTC | #8
On Mon, Jul 31 2023 at 16:10, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> The VM is an F72s_v2 in Azure running your patch set.  The VM has
> 72 vCPUs in two NUMA nodes across two physical Intel processors, with
> 36 vCPUs in each NUMA node.
>
> The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
> is from CPUID, while the apicid is from read_apic_id() and matches the
> MADT.  As expected, the two values match for the first 36 vCPUs, but differ
> by 28 (decimal) for the remaining 36.
>
> initial_apicid:      0 apicid:              0
...
> initial_apicid:      23 apicid:              23

> initial_apicid:      40 apicid:              24
...
> initial_apicid:      63 apicid:              47

Is there any indication in some other CPUID leaf which lets us deduce this
wreckage?

I don't think the hypervisor space (0x400000xx) has anything helpful, but
staring at the architectural ones provided by hyper-V to the guest might
give us a hint. Can you provide a cpuid dump for the boot CPU please?

Thanks,

        tglx
Michael Kelley July 31, 2023, 9:27 p.m. UTC | #9
From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 1:49 PM
> 
> On Mon, Jul 31 2023 at 16:10, Michael Kelley wrote:
> > From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> > The VM is an F72s_v2 in Azure running your patch set.  The VM has
> > 72 vCPUs in two NUMA nodes across two physical Intel processors, with
> > 36 vCPUs in each NUMA node.
> >
> > The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
> > is from CPUID, while the apicid is from read_apic_id() and matches the
> > MADT.  As expected, the two values match for the first 36 vCPUs, but differ
> > by 28 (decimal) for the remaining 36.
> >
> > initial_apicid:      0 apicid:              0
> ...
> > initial_apicid:      23 apicid:              23
> 
> > initial_apicid:      40 apicid:              24
> ...
> > initial_apicid:      63 apicid:              47
> 
> Is there any indication in some other CPUID leaf which lets us deduce this
> wreckage?

You can detect being a Hyper-V guest with leaf 0x40000000.  See Linux
kernel function ms_hyperv_platform().  But I'm not aware of anything
to indicate that a specific Hyper-V VM has the APIC numbering problem
vs. doesn't have the problem.
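
To illustrate the leaf 0x40000000 check: the vendor signature is packed
into EBX, ECX and EDX, four ASCII bytes per register, lowest byte first.
This is a userspace sketch with made-up function names, not the code in
ms_hyperv_platform(); it only shows how the raw register values in the
dump below spell out the signature.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unpack EBX, ECX, EDX (in that order) into a NUL-terminated string. */
static void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
			 char out[13])
{
	const uint32_t regs[3] = { ebx, ecx, edx };

	assert(out);

	for (int i = 0; i < 3; i++) {
		for (int b = 0; b < 4; b++)
			out[i * 4 + b] = (char)((regs[i] >> (8 * b)) & 0xff);
	}
	out[12] = '\0';
}

/* Return 1 if the three registers carry the Hyper-V signature. */
static int is_hyperv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char sig[13];

	hv_signature(ebx, ecx, edx, sig);
	return strcmp(sig, "Microsoft Hv") == 0;
}
```

Feeding it the values from the dump (ebx=0x7263694d, ecx=0x666f736f,
edx=0x76482074) yields "Microsoft Hv". As said, that only identifies the
hypervisor, not whether a given VM has the numbering problem.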

> 
> I don't think the hypervisor space (0x400000xx) has anything helpful, but
> staring at the architectural ones provided by hyper-V to the guest might
> give us a hint. Can you provide a cpuid dump for the boot CPU please?
> 

I'm not sure if you want the raw or decoded output.  Here's both.

Michael

# taskset -c 0 cpuid -r -1
CPU:
   0x00000000 0x00: eax=0x00000015 ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
   0x00000001 0x00: eax=0x000606a6 ebx=0x00400800 ecx=0xfeda3223 edx=0x1f8bfbff
   0x00000002 0x00: eax=0x00feff01 ebx=0x000000f0 ecx=0x00000000 edx=0x00000000
   0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000004 0x00: eax=0x7c004121 ebx=0x02c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x01: eax=0x7c004122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x02: eax=0x7c004143 ebx=0x04c0003f ecx=0x000003ff edx=0x00000000
   0x00000004 0x03: eax=0x7c0fc163 ebx=0x02c0003f ecx=0x0000ffff edx=0x00000000
   0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000007 0x00: eax=0x00000000 ebx=0xd09f2fb9 ecx=0x00000000 edx=0x00000400
   0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000
   0x0000000b 0x01: eax=0x00000006 ebx=0x00000040 ecx=0x00000201 edx=0x00000000
   0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x00: eax=0x000000e7 ebx=0x00000a80 ecx=0x00000a80 edx=0x00000000
   0x0000000d 0x01: eax=0x0000000b ebx=0x00000980 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x05: eax=0x00000040 ebx=0x00000440 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x06: eax=0x00000200 ebx=0x00000480 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x07: eax=0x00000400 ebx=0x00000680 ecx=0x00000000 edx=0x00000000
   0x0000000e 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000f 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000010 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000012 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000014 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000015 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000000 0x00: eax=0x4000000a ebx=0x7263694d ecx=0x666f736f edx=0x76482074
   0x40000001 0x00: eax=0x31237648 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000002 0x00: eax=0x00004f7c ebx=0x000a0000 ecx=0x00000001 edx=0x000005b6
   0x40000003 0x00: eax=0x00002e7f ebx=0x003b8030 ecx=0x00000002 edx=0x000ed7b2
   0x40000004 0x00: eax=0x00064e24 ebx=0x00000fff ecx=0x0000002e edx=0x00000000
   0x40000005 0x00: eax=0x000000f0 ebx=0x00000400 ecx=0x00005d00 edx=0x00000000
   0x40000006 0x00: eax=0x0000000f ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x4000000a 0x00: eax=0x000e0101 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x20000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000121 edx=0x2c100800
   0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x6f655820 edx=0x2952286e
   0x80000003 0x00: eax=0x616c5020 ebx=0x756e6974 ecx=0x3338206d edx=0x20433037
   0x80000004 0x00: eax=0x20555043 ebx=0x2e322040 ecx=0x48473038 edx=0x0000007a
   0x80000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x01006040 edx=0x00000000
   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000008 0x00: eax=0x0000302e ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80860000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0xc0000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
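
A rough sketch of how the two leaf 0xb subleafs above relate to the APIC
ID mismatch. EAX[4:0] is the shift width for the level, EBX[15:0] the
number of logical processors and ECX[15:8] the level type (1 = SMT,
2 = core). The struct and function names are invented for illustration,
and treating "APIC ID shifted right by the core level shift" as the
package ID is the usual leaf 0xb reading, not necessarily what the
series does in every case.

```c
#include <assert.h>
#include <stdint.h>

struct leaf_b_level {		/* hypothetical helper type */
	uint32_t shift;		/* EAX[4:0]  */
	uint32_t nlogical;	/* EBX[15:0] */
	uint32_t type;		/* ECX[15:8]: 1 = SMT, 2 = core */
};

/* Decode one subleaf of CPUID leaf 0xb from its raw register values. */
static struct leaf_b_level decode_leaf_b(uint32_t eax, uint32_t ebx,
					 uint32_t ecx)
{
	struct leaf_b_level l = {
		.shift		= eax & 0x1f,
		.nlogical	= ebx & 0xffff,
		.type		= (ecx >> 8) & 0xff,
	};

	return l;
}

/* Package ID: everything above the bits consumed by the core level. */
static uint32_t package_id(uint32_t apicid, uint32_t pkg_shift)
{
	assert(pkg_shift < 32);
	return apicid >> pkg_shift;
}
```

Subleaf 1 (eax=0x6, ecx=0x201) decodes to a core level with a shift of
6. With that shift the CPUID IDs 0x00-0x23 fall into package 0 and
0x40-0x63 into package 1, while the MADT IDs 0x24-0x3f of the second
node would end up in package 0.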

CPU:
   vendor_id = "GenuineIntel"
   version information (1/eax):
      processor type  = primary processor (0)
      family          = 0x6 (6)
      model           = 0xa (10)
      stepping id     = 0x6 (6)
      extended family = 0x0 (0)
      extended model  = 0x6 (6)
      (family synth)  = 0x6 (6)
      (model synth)   = 0x6a (106)
      (simple synth)  = Intel Core (Ice Lake) [Sunny Cove] {Sunny Cove}, 10nm
   miscellaneous (1/ebx):
      process local APIC physical ID = 0x0 (0)
      cpu count                      = 0x40 (64)
      CLFLUSH line size              = 0x8 (8)
      brand index                    = 0x0 (0)
   brand id = 0x00 (0): unknown
   feature information (1/edx):
      x87 FPU on chip                        = true
      VME: virtual-8086 mode enhancement     = true
      DE: debugging extensions               = true
      PSE: page size extensions              = true
      TSC: time stamp counter                = true
      RDMSR and WRMSR support                = true
      PAE: physical address extensions       = true
      MCE: machine check exception           = true
      CMPXCHG8B inst.                        = true
      APIC on chip                           = true
      SYSENTER and SYSEXIT                   = true
      MTRR: memory type range registers      = true
      PTE global bit                         = true
      MCA: machine check architecture        = true
      CMOV: conditional move/compare instr   = true
      PAT: page attribute table              = true
      PSE-36: page size extension            = true
      PSN: processor serial number           = false
      CLFLUSH instruction                    = true
      DS: debug store                        = false
      ACPI: thermal monitor and clock ctrl   = false
      MMX Technology                         = true
      FXSAVE/FXRSTOR                         = true
      SSE extensions                         = true
      SSE2 extensions                        = true
      SS: self snoop                         = true
      hyper-threading / multi-core supported = true
      TM: therm. monitor                     = false
      IA64                                   = false
      PBE: pending break event               = false
   feature information (1/ecx):
      PNI/SSE3: Prescott New Instructions     = true
      PCLMULDQ instruction                    = true
      DTES64: 64-bit debug store              = false
      MONITOR/MWAIT                           = false
      CPL-qualified debug store               = false
      VMX: virtual machine extensions         = true
      SMX: safer mode extensions              = false
      Enhanced Intel SpeedStep Technology     = false
      TM2: thermal monitor 2                  = false
      SSSE3 extensions                        = true
      context ID: adaptive or shared L1 data  = false
      SDBG: IA32_DEBUG_INTERFACE              = false
      FMA instruction                         = true
      CMPXCHG16B instruction                  = true
      xTPR disable                            = false
      PDCM: perfmon and debug                 = false
      PCID: process context identifiers       = true
      DCA: direct cache access                = false
      SSE4.1 extensions                       = true
      SSE4.2 extensions                       = true
      x2APIC: extended xAPIC support          = false
      MOVBE instruction                       = true
      POPCNT instruction                      = true
      time stamp counter deadline             = false
      AES instruction                         = true
      XSAVE/XSTOR states                      = true
      OS-enabled XSAVE/XSTOR                  = true
      AVX: advanced vector extensions         = true
      F16C half-precision convert instruction = true
      RDRAND instruction                      = true
      hypervisor guest status                 = true
   cache and TLB information (2):
      0xff: cache data is in CPUID leaf 4
      0xfe: TLB data is in CPUID leaf 0x18
      0xf0: 64 byte prefetching
   processor serial number = 0006-06A6-0000-0000-0000-0000
   deterministic cache parameters (4):
      --- cache 0 ---
      cache type                           = data cache (1)
      cache level                          = 0x1 (1)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0xc (12)
      number of sets                       = 0x40 (64)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 64
      (size synth)                         = 49152 (48 KB)
      --- cache 1 ---
      cache type                           = instruction cache (2)
      cache level                          = 0x1 (1)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0x8 (8)
      number of sets                       = 0x40 (64)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 64
      (size synth)                         = 32768 (32 KB)
      --- cache 2 ---
      cache type                           = unified cache (3)
      cache level                          = 0x2 (2)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0x14 (20)
      number of sets                       = 0x400 (1024)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 1024
      (size synth)                         = 1310720 (1.2 MB)
      --- cache 3 ---
      cache type                           = unified cache (3)
      cache level                          = 0x3 (3)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x3f (63)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0xc (12)
      number of sets                       = 0x10000 (65536)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 65536
      (size synth)                         = 50331648 (48 MB)
   MONITOR/MWAIT (5):
      smallest monitor-line size (bytes)       = 0x0 (0)
      largest monitor-line size (bytes)        = 0x0 (0)
      enum of Monitor-MWAIT exts supported     = false
      supports intrs as break-event for MWAIT  = false
      number of C0 sub C-states using MWAIT    = 0x0 (0)
      number of C1 sub C-states using MWAIT    = 0x0 (0)
      number of C2 sub C-states using MWAIT    = 0x0 (0)
      number of C3 sub C-states using MWAIT    = 0x0 (0)
      number of C4 sub C-states using MWAIT    = 0x0 (0)
      number of C5 sub C-states using MWAIT    = 0x0 (0)
      number of C6 sub C-states using MWAIT    = 0x0 (0)
      number of C7 sub C-states using MWAIT    = 0x0 (0)
   Thermal and Power Management Features (6):
      digital thermometer                     = false
      Intel Turbo Boost Technology            = false
      ARAT always running APIC timer          = false
      PLN power limit notification            = false
      ECMD extended clock modulation duty     = false
      PTM package thermal management          = false
      HWP base registers                      = false
      HWP notification                        = false
      HWP activity window                     = false
      HWP energy performance preference       = false
      HWP package level request               = false
      HDC base registers                      = false
      Intel Turbo Boost Max Technology 3.0    = false
      HWP capabilities                        = false
      HWP PECI override                       = false
      flexible HWP                            = false
      IA32_HWP_REQUEST MSR fast access mode   = false
      HW_FEEDBACK                             = false
      ignoring idle logical processor HWP req = false
      digital thermometer thresholds          = 0x0 (0)
      hardware coordination feedback          = false
      ACNT2 available                         = false
      performance-energy bias capability      = false
      performance capability reporting        = false
      energy efficiency capability reporting  = false
      size of feedback struct (4KB pages)     = 0x0 (0)
      index of CPU's row in feedback struct   = 0x0 (0)
   extended feature flags (7):
      FSGSBASE instructions                    = true
      IA32_TSC_ADJUST MSR supported            = false
      SGX: Software Guard Extensions supported = false
      BMI1 instructions                        = true
      HLE hardware lock elision                = true
      AVX2: advanced vector extensions 2       = true
      FDP_EXCPTN_ONLY                          = false
      SMEP supervisor mode exec protection     = true
      BMI2 instructions                        = true
      enhanced REP MOVSB/STOSB                 = true
      INVPCID instruction                      = true
      RTM: restricted transactional memory     = true
      RDT-CMT/PQoS cache monitoring            = false
      deprecated FPU CS/DS                     = true
      MPX: intel memory protection extensions  = false
      RDT-CAT/PQE cache allocation             = false
      AVX512F: AVX-512 foundation instructions = true
      AVX512DQ: double & quadword instructions = true
      RDSEED instruction                       = true
      ADX instructions                         = true
      SMAP: supervisor mode access prevention  = true
      AVX512IFMA: fused multiply add           = false
      PCOMMIT instruction                      = false
      CLFLUSHOPT instruction                   = true
      CLWB instruction                         = false
      Intel processor trace                    = false
      AVX512PF: prefetch instructions          = false
      AVX512ER: exponent & reciprocal instrs   = false
      AVX512CD: conflict detection instrs      = true
      SHA instructions                         = false
      AVX512BW: byte & word instructions       = true
      AVX512VL: vector length                  = true
      PREFETCHWT1                              = false
      AVX512VBMI: vector byte manipulation     = false
      UMIP: user-mode instruction prevention   = false
      PKU protection keys for user-mode        = false
      OSPKE CR4.PKE and RDPKRU/WRPKRU          = false
      WAITPKG instructions                     = false
      AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND  = false
      CET_SS: CET shadow stack                 = false
      GFNI: Galois Field New Instructions      = false
      VAES instructions                        = false
      VPCLMULQDQ instruction                   = false
      AVX512_VNNI: neural network instructions = false
      AVX512_BITALG: bit count/shiffle         = false
      TME: Total Memory Encryption             = false
      AVX512: VPOPCNTDQ instruction            = false
      5-level paging                           = false
      BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0)
      RDPID: read processor D supported        = false
      CLDEMOTE supports cache line demote      = false
      MOVDIRI instruction                      = false
      MOVDIR64B instruction                    = false
      ENQCMD instruction                       = false
      SGX_LC: SGX launch config supported      = false
      AVX512_4VNNIW: neural network instrs     = false
      AVX512_4FMAPS: multiply acc single prec  = false
      fast short REP MOV                       = false
      AVX512_VP2INTERSECT: intersect mask regs = false
      VERW md-clear microcode support          = true
      hybrid part                              = false
      PCONFIG instruction                      = false
      CET_IBT: CET indirect branch tracking    = false
      IBRS/IBPB: indirect branch restrictions  = false
      STIBP: 1 thr indirect branch predictor   = false
      L1D_FLUSH: IA32_FLUSH_CMD MSR            = false
      IA32_ARCH_CAPABILITIES MSR               = false
      IA32_CORE_CAPABILITIES MSR               = false
      SSBD: speculative store bypass disable   = false
   Direct Cache Access Parameters (9):
      PLATFORM_DCA_CAP MSR bits = 0
   Architecture Performance Monitoring Features (0xa/eax):
      version ID                               = 0x0 (0)
      number of counters per logical processor = 0x0 (0)
      bit width of counter                     = 0x0 (0)
      length of EBX bit vector                 = 0x0 (0)
   Architecture Performance Monitoring Features (0xa/ebx):
      core cycle event not available           = false
      instruction retired event not available  = false
      reference cycles event not available     = false
      last-level cache ref event not available = false
      last-level cache miss event not avail    = false
      branch inst retired event not available  = false
      branch mispred retired event not avail   = false
   Architecture Performance Monitoring Features (0xa/edx):
      number of fixed counters    = 0x0 (0)
      bit width of fixed counters = 0x0 (0)
      anythread deprecation       = false
   x2APIC features / processor topology (0xb):
      extended APIC ID                      = 0
      --- level 0 ---
      level number                          = 0x0 (0)
      level type                            = thread (1)
      bit width of level                    = 0x1 (1)
      number of logical processors at level = 0x2 (2)
      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level                    = 0x6 (6)
      number of logical processors at level = 0x40 (64)
   XSAVE features (0xd/0):
      XCR0 lower 32 bits valid bit field mask = 0x000000e7
      XCR0 upper 32 bits valid bit field mask = 0x00000000
         XCR0 supported: x87 state            = true
         XCR0 supported: SSE state            = true
         XCR0 supported: AVX state            = true
         XCR0 supported: MPX BNDREGS          = false
         XCR0 supported: MPX BNDCSR           = false
         XCR0 supported: AVX-512 opmask       = true
         XCR0 supported: AVX-512 ZMM_Hi256    = true
         XCR0 supported: AVX-512 Hi16_ZMM     = true
         IA32_XSS supported: PT state         = false
         XCR0 supported: PKRU state           = false
         XCR0 supported: CET_U state          = false
         XCR0 supported: CET_S state          = false
         IA32_XSS supported: HDC state        = false
      bytes required by fields in XCR0        = 0x00000a80 (2688)
      bytes required by XSAVE/XRSTOR area     = 0x00000a80 (2688)
   XSAVE features (0xd/1):
      XSAVEOPT instruction                        = true
      XSAVEC instruction                          = true
      XGETBV instruction                          = false
      XSAVES/XRSTORS instructions                 = true
      SAVE area size in bytes                     = 0x00000980 (2432)
      IA32_XSS lower 32 bits valid bit field mask = 0x00000000
      IA32_XSS upper 32 bits valid bit field mask = 0x00000000
   AVX/YMM features (0xd/2):
      AVX/YMM save state byte size             = 0x00000100 (256)
      AVX/YMM save state byte offset           = 0x00000240 (576)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 opmask features (0xd/5):
      AVX-512 opmask save state byte size      = 0x00000040 (64)
      AVX-512 opmask save state byte offset    = 0x00000440 (1088)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 ZMM_Hi256 features (0xd/6):
      AVX-512 ZMM_Hi256 save state byte size   = 0x00000200 (512)
      AVX-512 ZMM_Hi256 save state byte offset = 0x00000480 (1152)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 Hi16_ZMM features (0xd/7):
      AVX-512 Hi16_ZMM save state byte size    = 0x00000400 (1024)
      AVX-512 Hi16_ZMM save state byte offset  = 0x00000680 (1664)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   Quality of Service Monitoring Resource Type (0xf/0):
      Maximum range of RMID = 0
      supports L3 cache QoS monitoring = false
   Resource Director Technology Allocation (0x10/0):
      L3 cache allocation technology supported = false
      L2 cache allocation technology supported = false
      memory bandwidth allocation supported    = false
   0x00000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   Software Guard Extensions (SGX) capability (0x12/0):
      SGX1 supported                         = false
      SGX2 supported                         = false
      SGX ENCLV E*VIRTCHILD, ESETCONTEXT     = false
      SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false
      MISCSELECT.EXINFO supported: #PF & #GP = false
      MISCSELECT.CPINFO supported: #CP       = false
      MaxEnclaveSize_Not64 (log2)            = 0x0 (0)
      MaxEnclaveSize_64 (log2)               = 0x0 (0)
   0x00000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   Intel Processor Trace (0x14):
      IA32_RTIT_CR3_MATCH is accessible      = false
      configurable PSB & cycle-accurate      = false
      IP & TraceStop filtering; PT preserve  = false
      MTC timing packet; suppress COFI-based = false
      PTWRITE support                        = false
      power event trace support              = false
      ToPA output scheme support         = false
      ToPA can hold many output entries  = false
      single-range output scheme support = false
      output to trace transport          = false
      IP payloads have LIP values & CS   = false
   Time Stamp Counter/Core Crystal Clock Information (0x15):
      TSC/clock ratio = 0/0
      nominal core crystal clock = 0 Hz
   hypervisor_id = "Microsoft Hv"
   hypervisor interface identification (0x40000001/eax):
      version = "Hv#1"
   hypervisor system identity (0x40000002):
      build          = 20348
      version        = 10.0
      service pack   = 1
      service branch = 0
      service number = 1462
   hypervisor feature identification (0x40000003/eax):
      VP run time                      = true
      partition reference counter      = true
      basic synIC MSRs                 = true
      synthetic timer MSRs             = true
      APIC access MSRs                 = true
      hypercall MSRs                   = true
      access virtual process index MSR = true
      virtual system reset MSR         = false
      map/unmap statistics pages MSR   = false
      reference TSC access             = true
      guest idle state MSR             = true
      TSC/APIC frequency MSRs          = true
      guest debugging MSRs             = false
   hypervisor partition creation flags (0x40000003/ebx):
      CreatePartitions         = false
      AccessPartitionId        = false
      AccessMemoryPool         = false
      AdjustMessageBuffers     = false
      PostMessages             = true
      SignalEvents             = true
      CreatePort               = false
      ConnectPort              = false
      AccessStats              = false
      Debugging                = false
      CPUManagement            = false
      ConfigureProfiler        = false
      AccessVSM                = true
      AccessVpRegisters        = true
      EnableExtendedHypercalls = true
      StartVirtualProcessor    = true
   hypervisor power management features (0x40000003/ecx):
      maximum process power state = 0x2 (2)
   hypervisor feature identification (0x40000003/edx):
      MWAIT available                          = false
      guest debugging support available        = true
      performance monitor support available    = false
      CPU dynamic partitioning events avail    = false
      hypercall XMM input parameters available = true
      virtual guest idle state available       = true
      hypervisor sleep state available         = false
      query NUMA distance available            = true
      determine timer frequency available      = true
      inject synthetic machine check available = true
      guest crash MSRs available               = true
      debug MSRs available                     = false
      NPIEP available                          = true
      disable hypervisor available             = false
      extended gva ranges for flush virt addrs = true
      hypercall XMM register return available  = true
      sint polling mode available              = true
      hypercall MSR lock available             = true
      use direct synthetic timers              = true
   hypervisor recommendations (0x40000004/eax):
      use hypercalls for AS switches        = false
      use hypercalls for local TLB flushes  = false
      use hypercalls for remote TLB flushes = true
      use MSRs to access EOI, ICR, TPR      = false
      use MSRs to initiate system RESET     = false
      use relaxed timing                    = true
      use DMA remapping                     = false
      use interrupt remapping               = false
      use x2APIC MSRs                       = false
      deprecate AutoEOI                     = true
      use SyntheticClusterIpi hypercall     = true
      use ExProcessorMasks                  = true
      hypervisor is nested with Hyper-V     = false
      use INT for MBEC system calls         = false
      use enlightened VMCS interface        = true
      maximum number of spinlock retry attempts = 0xfff (4095)
   hypervisor implementation limits (0x40000005):
      maximum number of virtual processors                       = 0xf0 (240)
      maximum number of logical processors                       = 0x400 (1024)
      maximum number of physical interrupt vectors for remapping = 0x5d00 (23808)
   hypervisor hardware features used (0x40000006/eax):
      APIC overlay assist              = true
      MSR bitmaps                      = true
      performance counters             = true
      second-level address translation = true
      DMA remapping                    = false
      interrupt remapping              = false
      memory patrol scrubber           = false
      DMA protection                   = false
      HPET requested                   = false
      synthetic timers are volatile    = false
   hypervisor root partition enlightenments (0x40000007):
      StartLogicalProcessor      = false
      CreateRootvirtualProcessor = false
      ProcessorPowerManagement = false
      MwaitIdleStates          = false
      LogicalProcessorIdling   = false
   hypervisor shared virtual memory (0x40000008):
      SvmSupported            = false
      MaxPasidSpacePasidCount = 0x0 (0)
   hypervisor nested hypervisor features (0x40000009):
      AccessSynicRegs               = false
      AccessIntrCtrlRegs            = false
      AccessHypercallMsrs           = false
      AccessVpIndex                 = false
      AccessReenlightenmentControls = false
      XmmRegistersForFastHypercallAvailable = false
      FastHypercallOutputAvailable          = false
      SintPoillingModeAvailable             = false
   hypervisor nested virtualization features (0x4000000a):
      enlightened VMCS version (low)          = 0x1 (1)
      enlightened VMCS version (high)         = 0x1 (1)
      direct virtual flush hypercalls support = true
      HvFlushGuestPhysicalAddress* hypercalls = true
      enlightened MSR bitmap support          = true
   extended feature flags (0x80000001/edx):
      SYSCALL and SYSRET instructions        = true
      execution disable                      = true
      1-GB large page support                = true
      RDTSCP                                 = true
      64-bit extensions technology available = true
   Intel feature flags (0x80000001/ecx):
      LAHF/SAHF supported in 64-bit mode     = true
      LZCNT advanced bit manipulation        = true
      3DNow! PREFETCH/PREFETCHW instructions = true
   brand = "Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz"
   L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax):
      instruction # entries     = 0x0 (0)
      instruction associativity = 0x0 (0)
      data # entries            = 0x0 (0)
      data associativity        = 0x0 (0)
   L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx):
      instruction # entries     = 0x0 (0)
      instruction associativity = 0x0 (0)
      data # entries            = 0x0 (0)
      data associativity        = 0x0 (0)
   L1 data cache information (0x80000005/ecx):
      line size (bytes) = 0x0 (0)
      lines per tag     = 0x0 (0)
      associativity     = 0x0 (0)
      size (KB)         = 0x0 (0)
   L1 instruction cache information (0x80000005/edx):
      line size (bytes) = 0x0 (0)
      lines per tag     = 0x0 (0)
      associativity     = 0x0 (0)
      size (KB)         = 0x0 (0)
   L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax):
      instruction # entries     = 0x0 (0)
      instruction associativity = L2 off (0)
      data # entries            = 0x0 (0)
      data associativity        = L2 off (0)
   L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx):
      instruction # entries     = 0x0 (0)
      instruction associativity = L2 off (0)
      data # entries            = 0x0 (0)
      data associativity        = L2 off (0)
   L2 unified cache information (0x80000006/ecx):
      line size (bytes) = 0x40 (64)
      lines per tag     = 0x0 (0)
      associativity     = 8-way (6)
      size (KB)         = 0x100 (256)
   L3 cache information (0x80000006/edx):
      line size (bytes)     = 0x0 (0)
      lines per tag         = 0x0 (0)
      associativity         = L2 off (0)
      size (in 512KB units) = 0x0 (0)
   RAS Capability (0x80000007/ebx):
      MCA overflow recovery support = false
      SUCCOR support                = false
      HWA: hardware assert support  = false
      scalable MCA support          = false
   Advanced Power Management Features (0x80000007/ecx):
      CmpUnitPwrSampleTimeRatio = 0x0 (0)
   Advanced Power Management Features (0x80000007/edx):
      TS: temperature sensing diode           = false
      FID: frequency ID control               = false
      VID: voltage ID control                 = false
      TTP: thermal trip                       = false
      TM: thermal monitor                     = false
      STC: software thermal control           = false
      100 MHz multiplier control              = false
      hardware P-State control                = false
      TscInvariant                            = false
      CPB: core performance boost             = false
      read-only effective frequency interface = false
      processor feedback interface            = false
      APM power reporting                     = false
      connected standby                       = false
      RAPL: running average power limit       = false
   Physical Address and Linear Address Size (0x80000008/eax):
      maximum physical address bits         = 0x2e (46)
      maximum linear (virtual) address bits = 0x30 (48)
      maximum guest physical address bits   = 0x0 (0)
   Extended Feature Extensions ID (0x80000008/ebx):
      CLZERO instruction                       = false
      instructions retired count support       = false
      always save/restore error pointers       = false
      RDPRU instruction                        = false
      memory bandwidth enforcement             = false
      WBNOINVD instruction                     = false
      IBPB: indirect branch prediction barrier = false
      IBRS: indirect branch restr speculation  = false
      STIBP: 1 thr indirect branch predictor   = false
      STIBP always on preferred mode           = false
      ppin processor id number supported       = false
      SSBD: speculative store bypass disable   = false
      virtualized SSBD                         = false
      SSBD fixed in hardware                   = false
   Size Identifiers (0x80000008/ecx):
      number of CPU cores                 = 0x1 (1)
      ApicIdCoreIdSize                    = 0x0 (0)
      performance time-stamp counter size = 0x0 (0)
   Feature Extended Size (0x80000008/edx):
      RDPRU instruction max input support = 0x0 (0)
   (multi-processing synth) = multi-core (c=32), hyper-threaded (t=2)
   (multi-processing method) = Intel leaf 0xb
   (APIC widths synth): CORE_width=6 SMT_width=1
   (APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=0
   (uarch synth) = Intel Sunny Cove {Sunny Cove}, 10nm
   (synth) = Intel Core (Ice Lake) [Sunny Cove] {Sunny Cove}, 10nm
Thomas Gleixner July 31, 2023, 10:12 p.m. UTC | #10
On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 1:49 PM
>> Is there any indication in some other CPUID leaf which lets us deduce this
>> wreckage?
>
> You can detect being a Hyper-V guest with leaf 0x40000000.  See Linux
> kernel function ms_hyperv_platform().  But I'm not aware of anything
> to indicate that a specific Hyper-V VM has the APIC numbering problem
> vs. doesn't have the problem.

That's what I said :) here:

>> I don't think the hypervisor space (0x40000xx) has anything helpful, but
>> staring at the architectural ones provided by hyper-V to the guest might
>> give us a hint. Can you provide a cpuid dump for the boot CPU please?
>> 
>
> I'm not sure if you want the raw or decoded output.  Here's both.

Either way is fine.

Clearly the hyper-v BIOS people put a lot of thought into this:

>    x2APIC features / processor topology (0xb):
>       extended APIC ID                      = 0
>       --- level 0 ---
>       level number                          = 0x0 (0)
>       level type                            = thread (1)
>       bit width of level                    = 0x1 (1)
>       number of logical processors at level = 0x2 (2)
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x40 (64)

FAIL:                                           ^^^^^

While that field is not meant for topology evaluation it is at least
expected to tell the actual number of logical processors at that level
which are actually available. 

The CPUID APIC ID aka initial_apicid clearly tells that the topology has
36 logical CPUs in package 0 and 36 in package 1 according to your
table.
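
For reference, leaf 0xb reports per level the number of bits to shift the
x2APIC ID right to reach the next level, so the dump above (widths 1 and 6)
implies the split sketched below. This is a minimal userspace sketch, not
kernel code; struct topo_ids and decode_apicid() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded IDs; not a kernel structure. */
struct topo_ids {
	uint32_t pkg;
	uint32_t core;
	uint32_t smt;
};

/*
 * Split an APIC ID using the cumulative shift widths from leaf 0xb:
 * smt_shift is the width reported at the thread level, pkg_shift the
 * width reported at the core level.
 */
struct topo_ids decode_apicid(uint32_t apicid, unsigned int smt_shift,
			      unsigned int pkg_shift)
{
	struct topo_ids ids;

	assert(smt_shift <= pkg_shift && pkg_shift < 32);

	ids.smt  = apicid & ((1u << smt_shift) - 1);
	ids.core = (apicid & ((1u << pkg_shift) - 1)) >> smt_shift;
	ids.pkg  = apicid >> pkg_shift;
	return ids;
}
```

With the widths from the dump, decode_apicid(0x47, 1, 6) yields package 1,
core 3, thread 1, and every APIC ID below 0x40 lands in package 0.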

On real hardware this looks like this:

      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level                    = 0x6 (6)
      number of logical processors at level = 0x38 (56)

Which corresponds to reality and is consistent. But sure, consistency is
overrated.

Thanks,

        tglx
Thomas Gleixner Aug. 1, 2023, 10:25 p.m. UTC | #11
Michael!

On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> Clearly the hyper-v BIOS people put a lot of thought into this:
>
>>    x2APIC features / processor topology (0xb):
>>       extended APIC ID                      = 0
>>       --- level 0 ---
>>       level number                          = 0x0 (0)
>>       level type                            = thread (1)
>>       bit width of level                    = 0x1 (1)
>>       number of logical processors at level = 0x2 (2)
>>       --- level 1 ---
>>       level number                          = 0x1 (1)
>>       level type                            = core (2)
>>       bit width of level                    = 0x6 (6)
>>       number of logical processors at level = 0x40 (64)
>
> FAIL:                                           ^^^^^
>
> While that field is not meant for topology evaluation it is at least
> expected to tell the actual number of logical processors at that level
> which are actually available. 
>
> The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> 36 logical CPUs in package 0 and 36 in package 1 according to your
> table.
>
> On real hardware this looks like this:
>
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x38 (56)
>
> Which corresponds to reality and is consistent. But sure, consistency is
> overrated.

So I looked really hard to find some hint how to detect this situation
on the boot CPU, which allows us to mitigate it, but there is none at
all.

So we are caught between a rock and a hard place, which provides us two
mutually exclusive options to choose from:

  1) Have a sane topology evaluation mechanism which solves the known
     problems of hybrid systems, wrong sizing estimates and other
     unpleasantries.

  2) Support the Hyper-V BIOS trainwreck forever.

Unsurprisingly #2 is not really an option as #1 is a crucial issue for
the kernel and we need it resolved urgently as of yesterday.

So while I'm definitely a strong supporter of no-regression policy, I
have to make an argument here why this particular issue is _not_
covered:

 1) Hyper-V BIOS/firmware violates the firmware specification and
    requirements which are clearly spelled out in the SDM.

 2) This violation is reported on every boot with one prominent
    message per brought up AP where the initial APIC ID as provided by
    CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
    also provided by MADT starting with CPU 36 in the provided example:

    "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"

    repeating itself up to CPU71 with the relevant diverging APIC IDs.

    At least that's what the upstream kernel produces according to
    validate_apic_and_package_id() in such a situation.

 3) This has been known for years and the Hyper-V Linux team tried to get this
    resolved, but obviously their arguments fell on deaf ears.

    IOW, the firmware BUG message has been ignored willfully for years
    due to "works for me, why should I care?" attitude.
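
The consistency check behind the message quoted in item 2 can be sketched in
plain userspace C as below. This is only an illustration under assumptions:
struct cpu_ids and first_apicid_mismatch() are hypothetical, and the kernel
performs the comparison per CPU in validate_apic_and_package_id() rather than
over a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; the kernel keeps this state elsewhere. */
struct cpu_ids {
	uint32_t madt_apicid;	/* APIC ID handed over by firmware (MADT) */
	uint32_t cpuid_apicid;	/* initial APIC ID reported by CPUID */
};

/*
 * Return the index of the first CPU whose two IDs disagree,
 * or -1 if the whole table is consistent.
 */
long first_apicid_mismatch(const struct cpu_ids *cpus, size_t n)
{
	assert(cpus != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].madt_apicid != cpus[i].cpuid_apicid)
			return (long)i;
	}
	return -1;
}
```

A table entry with madt_apicid 0x40 and cpuid_apicid 0x24, the pair printed
for CPU36 above, is reported as a mismatch.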

Seriously, kernel development cannot be held hostage forever by the
wilful ignorance of a BIOS team, which refuses to adhere to
specifications and defines their own world order.

The x86 maintainer team is choosing the lesser of two evils and lets
those who created the problem and refused to resolve it deal with the
outcome.

Just to clarify. This is not preventing affected guests from booting.
The worst consequence is a slight performance regression because the
firmware provided topology information is not matching reality and
therefore the scheduler placement vs. L3 affinity sucks. That's clearly
not a kernel problem.

I'm happy to aid accelerating this thought process by elevating the
existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->topo.apicid) {
+	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}
Andrew Cooper Aug. 1, 2023, 10:35 p.m. UTC | #12
On 01/08/2023 11:25 pm, Thomas Gleixner wrote:
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
>  
>  	apicid = apic->cpu_present_to_apicid(cpu);
>  
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",

While you're fixing this, care to remove the chaotic-evil path of mixing
%u and %x with no 0x prefix?

~Andrew
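
One way to address the format complaint is to keep the CPU number decimal
and give both APIC IDs an explicit 0x prefix. The following is a userspace
sketch built on snprintf(), not a proposed kernel change;
format_apicid_mismatch() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the mismatch message with the CPU number in decimal and both
 * APIC IDs in hex with an explicit 0x prefix. Returns what snprintf()
 * returns: the length the full message needs, excluding the NUL.
 */
int format_apicid_mismatch(char *buf, size_t len, unsigned int cpu,
			   uint32_t fw_apicid, uint32_t cpuid_apicid)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len,
			"CPU%u: APIC id mismatch. Firmware: 0x%x APIC: 0x%x",
			cpu, (unsigned int)fw_apicid,
			(unsigned int)cpuid_apicid);
}
```

For the values from the report (CPU 36, 0x40, 0x24) this produces
"CPU36: APIC id mismatch. Firmware: 0x40 APIC: 0x24", which can no longer
be misread as decimal.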
Michael Kelley Aug. 2, 2023, 2:43 p.m. UTC | #13
From: Thomas Gleixner <tglx@linutronix.de> Sent: Tuesday, August 1, 2023 3:25 PM
> 
> Michael!
> 
> On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> > On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> > Clearly the hyper-v BIOS people put a lot of thought into this:
> >
> >>    x2APIC features / processor topology (0xb):
> >>       extended APIC ID                      = 0
> >>       --- level 0 ---
> >>       level number                          = 0x0 (0)
> >>       level type                            = thread (1)
> >>       bit width of level                    = 0x1 (1)
> >>       number of logical processors at level = 0x2 (2)
> >>       --- level 1 ---
> >>       level number                          = 0x1 (1)
> >>       level type                            = core (2)
> >>       bit width of level                    = 0x6 (6)
> >>       number of logical processors at level = 0x40 (64)
> >
> > FAIL:                                           ^^^^^
> >
> > While that field is not meant for topology evaluation it is at least
> > expected to tell the actual number of logical processors at that level
> > which are actually available.
> >
> > The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> > 36 logical CPUs in package 0 and 36 in package 1 according to your
> > table.
> >
> > On real hardware this looks like this:
> >
> >       --- level 1 ---
> >       level number                          = 0x1 (1)
> >       level type                            = core (2)
> >       bit width of level                    = 0x6 (6)
> >       number of logical processors at level = 0x38 (56)
> >
> > Which corresponds to reality and is consistent. But sure, consistency is
> > overrated.
> 
> So I looked really hard to find some hint how to detect this situation
> on the boot CPU, which allows us to mitigate it, but there is none at
> all.
> 
> So we are caught between a rock and a hard place, which provides us two
> mutually exclusive options to choose from:
> 
>   1) Have a sane topology evaluation mechanism which solves the known
>      problems of hybrid systems, wrong sizing estimates and other
>      unpleasantries.
> 
>   2) Support the Hyper-V BIOS trainwreck forever.
> 
> Unsurprisingly #2 is not really an option as #1 is a crucial issue for
> the kernel and we need it resolved urgently as of yesterday.
> 
> So while I'm definitely a strong supporter of no-regression policy, I
> have to make an argument here why this particular issue is _not_
> covered:
> 
>  1) Hyper-V BIOS/firmware violates the firmware specification and
>     requirements which are clearly spelled out in the SDM.
> 
>  2) This violation is reported on every boot with one prominent
>     message per brought up AP where the initial APIC ID as provided by
>     CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
>     also provided by MADT starting with CPU 36 in the provided example:
> 
>     "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"
> 
>     repeating itself up to CPU71 with the relevant diverging APIC IDs.
> 
>     At least that's what the upstream kernel produces according to
>     validate_apic_and_package_id() in such a situation.
> 
>  3) This has been known for years and the Hyper-V Linux team tried to get this
>     resolved, but obviously their arguments fell on deaf ears.
> 
>     IOW, the firmware BUG message has been ignored willfully for years
>     due to "works for me, why should I care?" attitude.
> 
> Seriously, kernel development cannot be held hostage forever by the
> wilful ignorance of a BIOS team, which refuses to adhere to
> specifications and defines their own world order.
> 
> The x86 maintainer team is choosing the lesser of two evils and lets
> those who created the problem and refused to resolve it deal with the
> outcome.

Fair enough.  I don't have any basis to argue otherwise.  I'm in
discussions with the Hyper-V team about getting it fully fixed in
Hyper-V, and it looks like there's some movement to make it happen.

> 
> Just to clarify. This is not preventing affected guests from booting.
> The worst consequence is a slight performance regression because the
> firmware provided topology information is not matching reality and
> therefore the scheduler placement vs. L3 affinity sucks. That's clearly
> not a kernel problem.

Yes, if Linux still boots and runs, that helps.  Then it really is up to the
(virtual) firmware in Hyper-V to provide the correct topology information
so performance is as expected.

> 
> I'm happy to aid accelerating this thought process by elevating the
> existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Your choice.  In this particular case, it won't make a difference either
way.

Michael

> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
> 
>  	apicid = apic->cpu_present_to_apicid(cpu);
> 
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
>  		       cpu, apicid, c->topo.initial_apicid);
>  	}