Message ID | 20250218190822.1039982-1-superm1@kernel.org |
---|---|
Series | Add support for AMD hardware feedback interface |
On Tue, 18 Feb 2025, Mario Limonciello wrote: > From: Perry Yuan <Perry.Yuan@amd.com> > > Introduce a new documentation file, `amd_hfi.rst`, which delves into the > implementation details of the AMD Hardware Feedback Interface and its > associated driver, `amd_hfi`. This documentation describes how the > driver provides hint to the OS scheduling which depends on the capability > of core performance and efficiency ranking data. > > This documentation describes > * The design of the driver > * How the driver provides hints to the OS scheduling > * How the driver interfaces with the kernel for efficiency ranking data. > > Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> > Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > Documentation/arch/x86/amd-hfi.rst | 127 +++++++++++++++++++++++++++++ > Documentation/arch/x86/index.rst | 1 + > 2 files changed, 128 insertions(+) > create mode 100644 Documentation/arch/x86/amd-hfi.rst > > diff --git a/Documentation/arch/x86/amd-hfi.rst b/Documentation/arch/x86/amd-hfi.rst > new file mode 100644 > index 0000000000000..5d204688470e3 > --- /dev/null > +++ b/Documentation/arch/x86/amd-hfi.rst > @@ -0,0 +1,127 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +====================================================================== > +Hardware Feedback Interface For Hetero Core Scheduling On AMD Platform > +====================================================================== > + > +:Copyright: 2024 Advanced Micro Devices, Inc. All Rights Reserved. 
> + > +:Author: Perry Yuan <perry.yuan@amd.com> > +:Author: Mario Limonciello <mario.limonciello@amd.com> > + > +Overview > +-------- > + > +AMD Heterogeneous Core implementations are comprised of more than one > +architectural class and CPUs are comprised of cores of various efficiency and > +power capabilities: performance-oriented *classic cores* and power-efficient > +*dense cores*. As such, power management strategies must be designed to > +accommodate the complexities introduced by incorporating different core types. > +Heterogeneous systems can also extend to more than two architectural classes as > +well. The purpose of the scheduling feedback mechanism is to provide > +information to the operating system scheduler in real time such that the > +scheduler can direct threads to the optimal core. > + > +The goal of AMD's heterogeneous architecture is to attain power benefit by sending > +background thread to the dense cores while sending high priority threads to the classic > +cores. From a performance perspective, sending background threads to dense cores can free > +up power headroom and allow the classic cores to optimally service demanding threads. > +Furthermore, the area optimized nature of the dense cores allows for an increasing > +number of physical cores. This improved core density will have positive multithreaded > +performance impact.

Hi Mario,

Please fold these paragraphs to 80 characters so that they're easier to read as textfiles (the table can obviously exceed that but there should be no reason for the text paragraphs to have excessively long lines).

My apologies for taking so long to get to review this series. Most of my comments are quite minor but there's also 1-2 things that seem more important. It seemed to me that there is some disconnection between the promises made in the Kconfig description and what is provided by the patch series.

--
i.

> + > +AMD Heterogeneous Core Driver > +----------------------------- > + > +The ``amd_hfi`` driver delivers the operating system a performance and energy efficiency > +capability data for each CPU in the system. The scheduler can use the ranking data > +from the HFI driver to make task placement decisions. > + > +Thread Classification and Ranking Table Interaction > +---------------------------------------------------- > + > +The thread classification is used to select into a ranking table that describes > +an efficiency and performance ranking for each classification. > + > +Threads are classified during runtime into enumerated classes. The classes represent > +thread performance/power characteristics that may benefit from special scheduling behaviors. > +The below table depicts an example of thread classification and a preference where a given thread > +should be scheduled based on its thread class. The real time thread classification is consumed > +by the operating system and is used to inform the scheduler of where the thread should be placed. > + > +Thread Classification Example Table > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > ++----------+----------------+-------------------------------+---------------------+---------+ > +| class ID | Classification | Preferred scheduling behavior | Preemption priority | Counter | > ++----------+----------------+-------------------------------+---------------------+---------+ > +| 0 | Default | Performant | Highest | | > ++----------+----------------+-------------------------------+---------------------+---------+ > +| 1 | Non-scalable | Efficient | Lowest | PMCx1A1 | > ++----------+----------------+-------------------------------+---------------------+---------+ > +| 2 | I/O bound | Efficient | Lowest | PMCx044 | > ++----------+----------------+-------------------------------+---------------------+---------+ > + > +Thread classification is performed by the hardware each time that the thread is switched out. 
> +Threads that don't meet any hardware specified criteria will be classified as "default". > + > +AMD Hardware Feedback Interface > +-------------------------------- > + > +The Hardware Feedback Interface provides to the operating system information > +about the performance and energy efficiency of each CPU in the system. Each > +capability is given as a unit-less quantity in the range [0-255]. A higher > +performance value indicates higher performance capability, and a higher > +efficiency value indicates more efficiency. Energy efficiency and performance > +are reported in separate capabilities in the shared memory based ranking table. > + > +These capabilities may change at runtime as a result of changes in the > +operating conditions of the system or the action of external factors. > +Power Management FW is responsible for detecting events that would require > +a reordering of the performance and efficiency ranking. Table updates would > +happen relatively infrequently and occur on the time scale of seconds or more. > + > +The following events trigger a table update: > + * Thermal Stress Events > + * Silent Compute > + * Extreme Low Battery Scenarios > + > +The kernel or a userspace policy daemon can use these capabilities to modify > +task placement decisions. For instance, if either the performance or energy > +capabilities of a given logical processor becomes zero, it is an indication that > +the hardware recommends to the operating system to not schedule any tasks on > +that processor for performance or energy efficiency reasons, respectively. > + > +Implementation details for Linux > +-------------------------------- > + > +The implementation of threads scheduling consists of the following steps: > + > +1. A thread is spawned and scheduled to the ideal core using the default > + heterogeneous scheduling policy. > +2. The processor profiles thread execution and assigns an enumerated classification ID. 
> + This classification is communicated to the OS via logical processor scope MSR. > +3. During the thread context switch out the operating system consumes the workload(WL) > + classification which resides in a logical processor scope MSR. > +4. The OS triggers the hardware to clear its history by writing to an MSR, > + after consuming the WL classification and before switching in the new thread. > +5. If due to the classification, ranking table, and processor availability, > + the thread is not on its ideal processor, the OS will then consider scheduling > + the thread on its ideal processor (if available). > + > +Ranking Table > +------------- > +The ranking table is a shared memory region that is used to communicate the > +performance and energy efficiency capabilities of each CPU in the system. > + > +The ranking table design includes rankings for each APIC ID in the system and > +rankings both for performance and efficiency for each workload classification. > + > +.. kernel-doc:: drivers/platform/x86/amd/hfi/hfi.c > + :doc: amd_shmem_info > + > +Ranking Table update > +--------------------------- > +The power management firmware issues an platform interrupt after updating the ranking > +table and is ready for the operating system to consume it. CPUs receive such interrupt > +and read new ranking table from shared memory which PCCT table has provided, then > +``amd_hfi`` driver parse the new table to provide new consume data for scheduling decisions. > diff --git a/Documentation/arch/x86/index.rst b/Documentation/arch/x86/index.rst > index 8ac64d7de4dc9..56f2923f52597 100644 > --- a/Documentation/arch/x86/index.rst > +++ b/Documentation/arch/x86/index.rst > @@ -43,3 +43,4 @@ x86-specific Documentation > features > elf_auxvec > xstate > + amd-hfi >
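The capability semantics described in the quoted documentation (unit-less values in the range 0-255, with zero meaning the hardware recommends not scheduling on that CPU) can be made concrete with a small userspace sketch. This is only an illustration under stated assumptions: `struct hfi_caps` and `pick_best_perf_cpu()` are hypothetical names, and the selection policy is not taken from the driver or the scheduler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU capability pair; values are unit-less, 0..255. */
struct hfi_caps {
	uint8_t perf;
	uint8_t eff;
};

/*
 * Return the index of the CPU with the highest performance capability,
 * skipping CPUs whose performance capability is zero (the documentation
 * says zero is a hint not to schedule there). Ties keep the lowest index.
 * Returns -1 if no CPU is eligible.
 */
int pick_best_perf_cpu(const struct hfi_caps *caps, size_t n)
{
	int best = -1;

	assert(caps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (caps[i].perf == 0)
			continue;
		if (best < 0 || caps[i].perf > caps[best].perf)
			best = (int)i;
	}

	return best;
}
```

The same loop could rank by `eff` instead; which dimension to prefer for a given workload class is a policy decision this sketch does not attempt to model.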
On Tue, 18 Feb 2025, Mario Limonciello wrote: > From: Perry Yuan <Perry.Yuan@amd.com> > > When `amd_hfi` driver is loaded, it will use PCCT subspace type 4 table > to retrieve the shared memory address which contains the CPU core ranking > table. This table includes a header that specifies the number of ranking > data entries to be parsed and rank each CPU core with the Performance and > Energy Efficiency capability as implemented by the CPU power management > firmware. > > Once the table has been parsed, each CPU is assigned a ranking score > within its class. Subsequently, when the scheduler selects cores, it > chooses from the ranking list based on the assigned scores in each class, > thereby ensuring the optimal selection of CPU cores according to their > predefined classifications and priorities. > > Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> > Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > v3: > * Reverse xmas tree order in amd_hfi_fill_metadata() > * s,for_each_present_cpu,for_each_possible_cpu, > v2: > * Drop __packed > --- > drivers/platform/x86/amd/hfi/hfi.c | 194 +++++++++++++++++++++++++++++ > 1 file changed, 194 insertions(+) > > diff --git a/drivers/platform/x86/amd/hfi/hfi.c b/drivers/platform/x86/amd/hfi/hfi.c > index 426f7e520b76c..7ab7ae0ec72ca 100644 > --- a/drivers/platform/x86/amd/hfi/hfi.c > +++ b/drivers/platform/x86/amd/hfi/hfi.c > @@ -18,22 +18,72 @@ > #include <linux/io.h> > #include <linux/kernel.h> > #include <linux/module.h> > +#include <linux/mailbox_client.h> > #include <linux/mutex.h> > +#include <linux/percpu-defs.h> > #include <linux/platform_device.h> > #include <linux/smp.h> > +#include <linux/topology.h> > +#include <linux/workqueue.h> > + > +#include <asm/cpu_device_id.h> > + > +#include <acpi/pcc.h> > +#include <acpi/cppc_acpi.h> > > #define AMD_HFI_DRIVER "amd_hfi" > +#define 
AMD_HFI_MAILBOX_COUNT 1 > +#define AMD_HETERO_RANKING_TABLE_VER 2 > > #define AMD_HETERO_CPUID_27 0x80000027 > > static struct platform_device *device; > > +/** > + * struct amd_shmem_info - Shared memory table for AMD HFI > + * > + * @header: The PCCT table header including signature, length flags and command. > + * @version_number: Version number of the table > + * @n_logical_processors: Number of logical processors > + * @n_capabilities: Number of ranking dimensions (performance, efficiency, etc) > + * @table_update_context: Command being sent over the subspace > + * @n_bitmaps: Number of 32-bit bitmaps to enumerate all the APIC IDs > + * This is based on the maximum APIC ID enumerated in the system > + * @reserved: 24 bit spare > + * @table_data: Bit Map(s) of enabled logical processors > + * Followed by the ranking data for each logical processor > + */ > +struct amd_shmem_info { > + struct acpi_pcct_ext_pcc_shared_memory header; > + u32 version_number :8, > + n_logical_processors :8, > + n_capabilities :8, > + table_update_context :8; > + u32 n_bitmaps :8, > + reserved :24; > + u32 table_data[]; > +}; > + > struct amd_hfi_data { > const char *name; > struct device *dev; > struct mutex lock; > + > + /* PCCT table related*/

Missing space.

> + struct pcc_mbox_chan *pcc_chan; > + void __iomem *pcc_comm_addr; > + struct acpi_subtable_header *pcct_entry; > + struct amd_shmem_info *shmem; > }; > > +/** > + * struct amd_hfi_classes - HFI class capabilities per CPU > + * @perf: Performance capability > + * @eff: Power efficiency capability > + * > + * Capabilities of a logical processor in the ranking table. These capabilities > + * are unitless and specific to each HFI class.
> + */ > struct amd_hfi_classes { > u32 perf; > u32 eff; > @@ -42,21 +92,103 @@ struct amd_hfi_classes { > /** > * struct amd_hfi_cpuinfo - HFI workload class info per CPU > * @cpu: cpu index > + * @apic_id: apic id of the current cpu > * @class_index: workload class ID index > * @nr_class: max number of workload class supported > + * @ipcc_scores: ipcc scores for each class > * @amd_hfi_classes: current cpu workload class ranking data > * > * Parameters of a logical processor linked with hardware feedback class > */ > struct amd_hfi_cpuinfo { > int cpu; > + u32 apic_id; > s16 class_index; > u8 nr_class; > + int *ipcc_scores; > struct amd_hfi_classes *amd_hfi_classes; > }; > > static DEFINE_PER_CPU(struct amd_hfi_cpuinfo, amd_hfi_cpuinfo) = {.class_index = -1}; > > +static int find_cpu_index_by_apicid(unsigned int target_apicid) > +{ > + int cpu_index; > + > + for_each_possible_cpu(cpu_index) { > + struct cpuinfo_x86 *info = &cpu_data(cpu_index); > + > + if (info->topo.apicid == target_apicid) { > + pr_debug("match APIC id %d for CPU index: %d\n", > + info->topo.apicid, cpu_index);

apicid is unsigned, so it should be printed with %u.

> + return cpu_index; > + } > + } > + > + return -ENODEV; > +} > + > +static int amd_hfi_fill_metadata(struct amd_hfi_data *amd_hfi_data) > +{ > + struct acpi_pcct_ext_pcc_slave *pcct_ext = > + (struct acpi_pcct_ext_pcc_slave *)amd_hfi_data->pcct_entry; > + void __iomem *pcc_comm_addr; > + > + pcc_comm_addr = acpi_os_ioremap(amd_hfi_data->pcc_chan->shmem_base_addr, > + amd_hfi_data->pcc_chan->shmem_size); > + if (!pcc_comm_addr) { > + pr_err("failed to ioremap PCC common region mem\n");

Don't you have ->dev available in amd_hfi_data so you could use dev_*() for all prints?

> + return -ENOMEM; > + } > + > + memcpy_fromio(amd_hfi_data->shmem, pcc_comm_addr, pcct_ext->length); > + iounmap(pcc_comm_addr); > + > + if (amd_hfi_data->shmem->header.signature != PCC_SIGNATURE) { > + pr_err("invalid signature in shared memory\n"); > + return -EINVAL; > + } > + if (amd_hfi_data->shmem->version_number != AMD_HETERO_RANKING_TABLE_VER) { > + pr_err("invalid version %d\n", amd_hfi_data->shmem->version_number); > + return -EINVAL; > + } > + > + for (unsigned int i = 0; i < amd_hfi_data->shmem->n_bitmaps; i++) { > + u32 bitmap = amd_hfi_data->shmem->table_data[i]; > + > + for (unsigned int j = 0; j < BITS_PER_TYPE(u32); j++) { > + int apic_id = i * BITS_PER_TYPE(u32) + j;

Why is this signed? If you change to unsigned int, remember to adjust the print line too.

> + struct amd_hfi_cpuinfo *info; > + int cpu_index; > + > + if (!(bitmap & BIT(j))) > + continue; > + > + cpu_index = find_cpu_index_by_apicid(apic_id); > + if (cpu_index < 0) { > + pr_warn("APIC ID %d not found\n", apic_id); > + continue; > + } > + > + info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu_index); > + info->apic_id = apic_id; > + > + /* Fill the ranking data for each logical processor */ > + info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu_index); > + for (unsigned int k = 0; k < info->nr_class; k++) { > + u32 *table = amd_hfi_data->shmem->table_data + > + amd_hfi_data->shmem->n_bitmaps + > + i * info->nr_class; > + > + info->amd_hfi_classes[k].eff = table[apic_id + 2 * k]; > + info->amd_hfi_classes[k].perf = table[apic_id + 2 * k + 1]; > + } > + } > + } > + > + return 0; > +} > + > static int amd_hfi_alloc_class_data(struct platform_device *pdev) > { > struct amd_hfi_cpuinfo *hfi_cpuinfo; > @@ -73,6 +205,7 @@ static int amd_hfi_alloc_class_data(struct platform_device *pdev) > > for_each_possible_cpu(idx) { > struct amd_hfi_classes *classes; > + int *ipcc_scores; > > classes = devm_kcalloc(dev, > nr_class_id, > @@ -80,14 +213,71 @@ static int amd_hfi_alloc_class_data(struct platform_device *pdev)
> GFP_KERNEL); > if (!classes) > return -ENOMEM; > + ipcc_scores = devm_kcalloc(dev, nr_class_id, sizeof(int), GFP_KERNEL); > + if (!ipcc_scores) > + return -ENOMEM; > hfi_cpuinfo = per_cpu_ptr(&amd_hfi_cpuinfo, idx); > hfi_cpuinfo->amd_hfi_classes = classes; > + hfi_cpuinfo->ipcc_scores = ipcc_scores; > hfi_cpuinfo->nr_class = nr_class_id; > } > > return 0; > } > > +static int amd_hfi_metadata_parser(struct platform_device *pdev, > + struct amd_hfi_data *amd_hfi_data) > +{ > + struct acpi_pcct_ext_pcc_slave *pcct_ext; > + struct acpi_subtable_header *pcct_entry; > + struct mbox_chan *pcc_mbox_channels; > + struct acpi_table_header *pcct_tbl; > + struct pcc_mbox_chan *pcc_chan; > + acpi_status status; > + int ret; > + > + pcc_mbox_channels = devm_kcalloc(&pdev->dev, AMD_HFI_MAILBOX_COUNT, > + sizeof(*pcc_mbox_channels), GFP_KERNEL); > + if (!pcc_mbox_channels) > + return -ENOMEM; > + > + pcc_chan = devm_kcalloc(&pdev->dev, AMD_HFI_MAILBOX_COUNT, > + sizeof(*pcc_chan), GFP_KERNEL); > + if (!pcc_chan) > + return -ENOMEM; > + > + status = acpi_get_table(ACPI_SIG_PCCT, 0, &pcct_tbl); > + if (ACPI_FAILURE(status) || !pcct_tbl) > + return -ENODEV; > + > + /* get pointer to the first PCC subspace entry */ > + pcct_entry = (struct acpi_subtable_header *) ( > + (unsigned long)pcct_tbl + sizeof(struct acpi_table_pcct)); > + > + pcc_chan->mchan = &pcc_mbox_channels[0]; > + > + amd_hfi_data->pcc_chan = pcc_chan; > + amd_hfi_data->pcct_entry = pcct_entry; > + pcct_ext = (struct acpi_pcct_ext_pcc_slave *)pcct_entry; > + > + if (pcct_ext->length <= 0) > + return -EINVAL; > + > + amd_hfi_data->shmem = devm_kzalloc(amd_hfi_data->dev, pcct_ext->length, GFP_KERNEL); > + if (!amd_hfi_data->shmem) > + return -ENOMEM; > + > + pcc_chan->shmem_base_addr = pcct_ext->base_address; > + pcc_chan->shmem_size = pcct_ext->length; > + > + /* parse the shared memory info from the pcct table */

PCCT

> + ret = amd_hfi_fill_metadata(amd_hfi_data); > + > + acpi_put_table(pcct_tbl); > + > + return ret;
> +} > + > static const struct acpi_device_id amd_hfi_platform_match[] = { > {"AMDI0104", 0}, > { } > @@ -116,6 +306,10 @@ static int amd_hfi_probe(struct platform_device *pdev) > if (ret) > return ret; > > + ret = amd_hfi_metadata_parser(pdev, amd_hfi_data); > + if (ret) > + return ret; > + > return 0; > } > >
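The signedness comments above can be illustrated with a minimal userspace sketch of the bitmap walk, keeping the APIC ID unsigned throughout. This is not the driver code: `collect_apic_ids()` and `BITS_PER_U32` are hypothetical names, and the only thing assumed from the patch is that bit `j` of bitmap `i` stands for APIC ID `i * 32 + j`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_U32 32u

/*
 * Collect the APIC IDs whose bits are set in an array of 32-bit bitmaps.
 * Bit j of bitmap i corresponds to APIC ID i * 32 + j.
 * Stores at most cap IDs in out and returns the number of set bits found,
 * which may exceed cap if the output buffer was too small.
 */
size_t collect_apic_ids(const uint32_t *bitmaps, unsigned int n_bitmaps,
			unsigned int *out, size_t cap)
{
	size_t found = 0;

	assert(bitmaps != NULL || n_bitmaps == 0);

	for (unsigned int i = 0; i < n_bitmaps; i++) {
		for (unsigned int j = 0; j < BITS_PER_U32; j++) {
			/* Unsigned on purpose: an APIC ID is never negative. */
			unsigned int apic_id = i * BITS_PER_U32 + j;

			if (!(bitmaps[i] & (UINT32_C(1) << j)))
				continue;
			if (found < cap)
				out[found] = apic_id;
			found++;
		}
	}

	return found;
}
```

With an unsigned `apic_id`, any message that prints it would use `%u` rather than `%d`, which is the follow-up adjustment the review asks for.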
On Tue, 18 Feb 2025, Mario Limonciello wrote: > From: Perry Yuan <Perry.Yuan@amd.com> > > There are some firmware parameters that need to be configured > when a CPU core is brought online or offline. > > when CPU is online, it will initialize the workload classification > parameters to CPU firmware which will trigger the workload class ID > updating function. > > Once the CPU is going to offline, it will need to disable the workload > classification function and clear the history. > > Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> > Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > v8: > * Move cpus member to this patch > * Add comment about online > v7: > * move mutex to this patch > --- > drivers/platform/x86/amd/hfi/hfi.c | 87 ++++++++++++++++++++++++++++++ > 1 file changed, 87 insertions(+) > > diff --git a/drivers/platform/x86/amd/hfi/hfi.c b/drivers/platform/x86/amd/hfi/hfi.c > index e1550f4463275..90b57175ccd97 100644 > --- a/drivers/platform/x86/amd/hfi/hfi.c > +++ b/drivers/platform/x86/amd/hfi/hfi.c > @@ -93,6 +93,7 @@ struct amd_hfi_classes { > * struct amd_hfi_cpuinfo - HFI workload class info per CPU > * @cpu: cpu index > * @apic_id: apic id of the current cpu > + * @cpus: mask of cpus associated with amd_hfi_cpuinfo > * @class_index: workload class ID index > * @nr_class: max number of workload class supported > * @ipcc_scores: ipcc scores for each class > @@ -103,6 +104,7 @@ struct amd_hfi_classes { > struct amd_hfi_cpuinfo { > int cpu; > u32 apic_id; > + cpumask_var_t cpus; > s16 class_index; > u8 nr_class; > int *ipcc_scores; > @@ -111,6 +113,8 @@ struct amd_hfi_cpuinfo { > > static DEFINE_PER_CPU(struct amd_hfi_cpuinfo, amd_hfi_cpuinfo) = {.class_index = -1}; > > +static DEFINE_MUTEX(hfi_cpuinfo_lock);

Please mention what this protects.

> + > static int find_cpu_index_by_apicid(unsigned int target_apicid) > { > int cpu_index; > @@ -234,6 +238,80 @@ static int amd_set_hfi_ipcc_score(struct amd_hfi_cpuinfo *hfi_cpuinfo, int cpu) > return 0; > } > > +static int amd_hfi_set_state(unsigned int cpu, bool state) > +{ > + int ret; > + > + ret = wrmsrl_on_cpu(cpu, AMD_WORKLOAD_CLASS_CONFIG, state);

I'd prefer the bool -> u64 conversion be done explicitly, e.g., with the ?: operator.

> + if (ret) > + return ret; > + > + return wrmsrl_on_cpu(cpu, AMD_WORKLOAD_HRST, 0x1); > +} > + > +/** > + * amd_hfi_online() - Enable workload classification on @cpu > + * @cpu: CPU in which the workload classification will be enabled > + * > + * Return: 0 on success, negative error code on failure > + */ > +static int amd_hfi_online(unsigned int cpu) > +{ > + struct amd_hfi_cpuinfo *hfi_info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu); > + struct amd_hfi_classes *hfi_classes; > + int ret; > + > + if (WARN_ON_ONCE(!hfi_info)) > + return -EINVAL; > + > + /* > + * Check if @cpu as an associated, initialized and ranking data must be filled

Please fold to 80 characters.

> + */ > + hfi_classes = hfi_info->amd_hfi_classes; > + if (!hfi_classes) > + return -EINVAL; > + > + guard(mutex)(&hfi_cpuinfo_lock); > + > + if (!zalloc_cpumask_var(&hfi_info->cpus, GFP_KERNEL)) > + return -ENOMEM; > + > + cpumask_set_cpu(cpu, hfi_info->cpus); > + > + ret = amd_hfi_set_state(cpu, true); > + if (ret) > + pr_err("WCT enable failed for CPU %d\n", cpu);

%u, since cpu is unsigned.

> + > + return ret; > +} > + > +/** > + * amd_hfi_offline() - Disable workload classification on @cpu > + * @cpu: CPU in which the workload classification will be disabled > + * > + * Remove @cpu from those covered by its HFI instance.
> + * > + * Return: 0 on success, negative error code on failure > + */ > +static int amd_hfi_offline(unsigned int cpu) > +{ > + struct amd_hfi_cpuinfo *hfi_info = &per_cpu(amd_hfi_cpuinfo, cpu); > + int ret; > + > + if (WARN_ON_ONCE(!hfi_info)) > + return -EINVAL; > + > + guard(mutex)(&hfi_cpuinfo_lock); > + > + ret = amd_hfi_set_state(cpu, false); > + if (ret) > + pr_err("WCT disable failed for CPU %d\n", cpu);

%u, since cpu is unsigned.

> + > + free_cpumask_var(hfi_info->cpus); > + > + return ret; > +} > + > static int update_hfi_ipcc_scores(void) > { > int cpu; > @@ -339,6 +417,15 @@ static int amd_hfi_probe(struct platform_device *pdev) > if (ret) > return ret; > > + /* > + * Tasks will already be running at the time this happens. This is > + * OK because rankings will be adjusted by the callbacks. > + */ > + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/amd_hfi:online", > + amd_hfi_online, amd_hfi_offline); > + if (ret < 0) > + return ret; > + > return 0; > } > >
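The request for an explicit bool -> u64 conversion could look roughly like the following userspace sketch. The helper name `wct_state_to_msr_val()` is hypothetical, and the assumption that "enabled" maps to 1 and "disabled" to 0 simply mirrors what the implicit conversion in the patch would produce; the actual register layout is not described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR writes take a 64-bit value; make that expectation visible. */
static_assert(sizeof(uint64_t) == 8, "MSR values are 64 bits wide");

/*
 * Spell out the value written for each state instead of relying on the
 * implicit bool-to-integer promotion at the call site.
 */
uint64_t wct_state_to_msr_val(bool enable)
{
	return enable ? UINT64_C(1) : UINT64_C(0);
}
```

In the driver, the equivalent would presumably be passing `state ? 1 : 0` as the value argument, so the reader does not have to reason about how a `bool` is widened.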
On Tue, 18 Feb 2025, Mario Limonciello wrote: > From: Mario Limonciello <mario.limonciello@amd.com> > > Add a dump of the class and capabilities table to debugfs to assist > with debugging scheduler issues. > > Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > v8: > * s,for_each_present_cpu,for_each_possible_cpu, > v3: > * Move idx to earlier line > --- > drivers/platform/x86/amd/hfi/hfi.c | 35 ++++++++++++++++++++++++++++++ > 1 file changed, 35 insertions(+) > > diff --git a/drivers/platform/x86/amd/hfi/hfi.c b/drivers/platform/x86/amd/hfi/hfi.c > index 14378a0e09e21..79d065d7b6441 100644 > --- a/drivers/platform/x86/amd/hfi/hfi.c > +++ b/drivers/platform/x86/amd/hfi/hfi.c > @@ -13,6 +13,7 @@ > #include <linux/acpi.h> > #include <linux/cpu.h> > #include <linux/cpumask.h> > +#include <linux/debugfs.h> > #include <linux/gfp.h> > #include <linux/init.h> > #include <linux/io.h> > @@ -74,6 +75,8 @@ struct amd_hfi_data { > void __iomem *pcc_comm_addr; > struct acpi_subtable_header *pcct_entry; > struct amd_shmem_info *shmem; > + > + struct dentry *dbgfs_dir; > }; > > /** > @@ -235,6 +238,13 @@ static int amd_hfi_alloc_class_data(struct platform_device *pdev) > return 0; > } > > +static void amd_hfi_remove(struct platform_device *pdev) > +{ > + struct amd_hfi_data *dev = platform_get_drvdata(pdev); > + > + debugfs_remove_recursive(dev->dbgfs_dir); > +} > + > static int amd_set_hfi_ipcc_score(struct amd_hfi_cpuinfo *hfi_cpuinfo, int cpu) > { > for (int i = 0; i < hfi_cpuinfo->nr_class; i++) > @@ -389,6 +399,26 @@ static int amd_hfi_metadata_parser(struct platform_device *pdev, > return ret; > } > > +static int class_capabilities_show(struct seq_file *s, void *unused) > +{ > + int cpu, idx; > + > + seq_puts(s, "CPU #\tWLC\tPerf\tEff\n"); > + for_each_possible_cpu(cpu) { > + struct amd_hfi_cpuinfo *hfi_cpuinfo = per_cpu_ptr(&amd_hfi_cpuinfo, cpu); > + > + seq_printf(s, "%d", cpu); > + for (idx = 0; 
idx < hfi_cpuinfo->nr_class; idx++) { > + seq_printf(s, "\t%d\t%d\t%d\n", idx, > + hfi_cpuinfo->amd_hfi_classes[idx].perf, > + hfi_cpuinfo->amd_hfi_classes[idx].eff);

Please use %u for unsigned variables.

> + } > + } > + > + return 0; > +} > +DEFINE_SHOW_ATTRIBUTE(class_capabilities); > + > static int amd_hfi_pm_resume(struct device *dev) > { > int ret, cpu; > @@ -468,6 +498,10 @@ static int amd_hfi_probe(struct platform_device *pdev) > > schedule_work(&sched_amd_hfi_itmt_work); > > + amd_hfi_data->dbgfs_dir = debugfs_create_dir("amd_hfi", arch_debugfs_dir); > + debugfs_create_file("class_capabilities", 0644, amd_hfi_data->dbgfs_dir, pdev, > + &class_capabilities_fops); > + > return 0; > } > > @@ -479,6 +513,7 @@ static struct platform_driver amd_hfi_driver = { > .acpi_match_table = ACPI_PTR(amd_hfi_platform_match), > }, > .probe = amd_hfi_probe, > + .remove = amd_hfi_remove, > }; > > static int __init amd_hfi_init(void) >
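A small userspace sketch of the `%u` suggestion, formatting one row of the dump with unsigned conversions. `format_class_row()` is a hypothetical helper that uses `snprintf()` in place of `seq_printf()`; it assumes the class index is also made unsigned, which the review does not explicitly ask for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one "WLC / Perf / Eff" row in the tab-separated layout used by
 * the quoted debugfs dump, with unsigned conversions for every field.
 * Returns the snprintf() result: the length the row needs, excluding the
 * terminating NUL.
 */
int format_class_row(char *buf, size_t len, unsigned int idx,
		     uint32_t perf, uint32_t eff)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "\t%u\t%u\t%u\n", idx,
			(unsigned int)perf, (unsigned int)eff);
}
```

In kernel code `u32` is `unsigned int`, so `%u` matches without the casts; they appear here only because `uint32_t` is not guaranteed to be `unsigned int` in portable userspace C.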
On 3/19/2025 09:01, Ilpo Järvinen wrote: > On Tue, 18 Feb 2025, Mario Limonciello wrote: > >> From: Perry Yuan <Perry.Yuan@amd.com> >> >> Introduce a new documentation file, `amd_hfi.rst`, which delves into the >> implementation details of the AMD Hardware Feedback Interface and its >> associated driver, `amd_hfi`. This documentation describes how the >> driver provides hint to the OS scheduling which depends on the capability >> of core performance and efficiency ranking data. >> >> This documentation describes >> * The design of the driver >> * How the driver provides hints to the OS scheduling >> * How the driver interfaces with the kernel for efficiency ranking data. >> >> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> >> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> >> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> Documentation/arch/x86/amd-hfi.rst | 127 +++++++++++++++++++++++++++++ >> Documentation/arch/x86/index.rst | 1 + >> 2 files changed, 128 insertions(+) >> create mode 100644 Documentation/arch/x86/amd-hfi.rst >> >> diff --git a/Documentation/arch/x86/amd-hfi.rst b/Documentation/arch/x86/amd-hfi.rst >> new file mode 100644 >> index 0000000000000..5d204688470e3 >> --- /dev/null >> +++ b/Documentation/arch/x86/amd-hfi.rst >> @@ -0,0 +1,127 @@ >> +.. SPDX-License-Identifier: GPL-2.0 >> + >> +====================================================================== >> +Hardware Feedback Interface For Hetero Core Scheduling On AMD Platform >> +====================================================================== >> + >> +:Copyright: 2024 Advanced Micro Devices, Inc. All Rights Reserved. 
>> + >> +:Author: Perry Yuan <perry.yuan@amd.com> >> +:Author: Mario Limonciello <mario.limonciello@amd.com> >> + >> +Overview >> +-------- >> + >> +AMD Heterogeneous Core implementations are comprised of more than one >> +architectural class and CPUs are comprised of cores of various efficiency and >> +power capabilities: performance-oriented *classic cores* and power-efficient >> +*dense cores*. As such, power management strategies must be designed to >> +accommodate the complexities introduced by incorporating different core types. >> +Heterogeneous systems can also extend to more than two architectural classes as >> +well. The purpose of the scheduling feedback mechanism is to provide >> +information to the operating system scheduler in real time such that the >> +scheduler can direct threads to the optimal core. >> + >> +The goal of AMD's heterogeneous architecture is to attain power benefit by sending >> +background thread to the dense cores while sending high priority threads to the classic >> +cores. From a performance perspective, sending background threads to dense cores can free >> +up power headroom and allow the classic cores to optimally service demanding threads. >> +Furthermore, the area optimized nature of the dense cores allows for an increasing >> +number of physical cores. This improved core density will have positive multithreaded >> +performance impact. > > Hi Mario, > > Please fold these paragraphs to 80 characters so that they're easier to > read as textfiles (the table can obviously exceed that but there should be > no reason for the text paragraphs to have excessively long lines). > > My apologies for taking so long to get to review this series.

No problem. Thanks for looking. I'll get a new version ready to put out after the next merge window.

> Most of my > comments are quite minor but there's also 1-2 things that seem more > important.
It seemed to me that there is some disconnection between the > promises made in the Kconfig description and what is provided by the patch > series.

Some of the series was pared down to go in multiple parts to make it easier to review, with follow-ups for the dynamic parts planned for the next iteration. You can see some artifacts of that in the comments and Kconfig. I figured it was better to leave those as is, given that they convey the intent, but I can change them if you think it's better to adjust them when the next part lands instead.

> > -- > i. > >> + >> +AMD Heterogeneous Core Driver >> +----------------------------- >> + >> +The ``amd_hfi`` driver delivers the operating system a performance and energy efficiency >> +capability data for each CPU in the system. The scheduler can use the ranking data >> +from the HFI driver to make task placement decisions. >> + >> +Thread Classification and Ranking Table Interaction >> +---------------------------------------------------- >> + >> +The thread classification is used to select into a ranking table that describes >> +an efficiency and performance ranking for each classification. >> + >> +Threads are classified during runtime into enumerated classes. The classes represent >> +thread performance/power characteristics that may benefit from special scheduling behaviors. >> +The below table depicts an example of thread classification and a preference where a given thread >> +should be scheduled based on its thread class. The real time thread classification is consumed >> +by the operating system and is used to inform the scheduler of where the thread should be placed.
>> + >> +Thread Classification Example Table >> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> ++----------+----------------+-------------------------------+---------------------+---------+ >> +| class ID | Classification | Preferred scheduling behavior | Preemption priority | Counter | >> ++----------+----------------+-------------------------------+---------------------+---------+ >> +| 0 | Default | Performant | Highest | | >> ++----------+----------------+-------------------------------+---------------------+---------+ >> +| 1 | Non-scalable | Efficient | Lowest | PMCx1A1 | >> ++----------+----------------+-------------------------------+---------------------+---------+ >> +| 2 | I/O bound | Efficient | Lowest | PMCx044 | >> ++----------+----------------+-------------------------------+---------------------+---------+ >> + >> +Thread classification is performed by the hardware each time that the thread is switched out. >> +Threads that don't meet any hardware specified criteria will be classified as "default". >> + >> +AMD Hardware Feedback Interface >> +-------------------------------- >> + >> +The Hardware Feedback Interface provides to the operating system information >> +about the performance and energy efficiency of each CPU in the system. Each >> +capability is given as a unit-less quantity in the range [0-255]. A higher >> +performance value indicates higher performance capability, and a higher >> +efficiency value indicates more efficiency. Energy efficiency and performance >> +are reported in separate capabilities in the shared memory based ranking table. >> + >> +These capabilities may change at runtime as a result of changes in the >> +operating conditions of the system or the action of external factors. >> +Power Management FW is responsible for detecting events that would require >> +a reordering of the performance and efficiency ranking. Table updates would >> +happen relatively infrequently and occur on the time scale of seconds or more. 
>> + >> +The following events trigger a table update: >> + * Thermal Stress Events >> + * Silent Compute >> + * Extreme Low Battery Scenarios >> + >> +The kernel or a userspace policy daemon can use these capabilities to modify >> +task placement decisions. For instance, if either the performance or energy >> +capabilities of a given logical processor becomes zero, it is an indication that >> +the hardware recommends to the operating system to not schedule any tasks on >> +that processor for performance or energy efficiency reasons, respectively. >> + >> +Implementation details for Linux >> +-------------------------------- >> + >> +The implementation of threads scheduling consists of the following steps: >> + >> +1. A thread is spawned and scheduled to the ideal core using the default >> + heterogeneous scheduling policy. >> +2. The processor profiles thread execution and assigns an enumerated classification ID. >> + This classification is communicated to the OS via logical processor scope MSR. >> +3. During the thread context switch out the operating system consumes the workload(WL) >> + classification which resides in a logical processor scope MSR. >> +4. The OS triggers the hardware to clear its history by writing to an MSR, >> + after consuming the WL classification and before switching in the new thread. >> +5. If due to the classification, ranking table, and processor availability, >> + the thread is not on its ideal processor, the OS will then consider scheduling >> + the thread on its ideal processor (if available). >> + >> +Ranking Table >> +------------- >> +The ranking table is a shared memory region that is used to communicate the >> +performance and energy efficiency capabilities of each CPU in the system. >> + >> +The ranking table design includes rankings for each APIC ID in the system and >> +rankings both for performance and efficiency for each workload classification. >> + >> +.. 
kernel-doc:: drivers/platform/x86/amd/hfi/hfi.c >> + :doc: amd_shmem_info >> + >> +Ranking Table update >> +--------------------------- >> +The power management firmware issues an platform interrupt after updating the ranking >> +table and is ready for the operating system to consume it. CPUs receive such interrupt >> +and read new ranking table from shared memory which PCCT table has provided, then >> +``amd_hfi`` driver parse the new table to provide new consume data for scheduling decisions. >> diff --git a/Documentation/arch/x86/index.rst b/Documentation/arch/x86/index.rst >> index 8ac64d7de4dc9..56f2923f52597 100644 >> --- a/Documentation/arch/x86/index.rst >> +++ b/Documentation/arch/x86/index.rst >> @@ -43,3 +43,4 @@ x86-specific Documentation >> features >> elf_auxvec >> xstate >> + amd-hfi >> >
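The context-switch steps listed in the quoted documentation can be sketched in
plain userspace C. This is only an illustration under stated assumptions: the
MSRs are replaced by ordinary struct fields, only the performance dimension is
modelled, and every name (`fake_cpu`, `rank_table`, `ideal_cpu`,
`on_switch_out`) is hypothetical rather than taken from the `amd_hfi` driver.
It also applies the rule from the documentation that a capability of zero
means the hardware recommends not scheduling on that CPU.

```c
#include <assert.h>
#include <stdint.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define NR_CPUS    4
#define NR_CLASSES 3	/* class IDs 0..2, as in the example table */

/* Stand-ins for the logical-processor-scope MSRs; not real MSR accesses. */
struct fake_cpu {
	unsigned int class_msr;	/* classification ID reported by hardware */
	int history_cleared;	/* set when the OS asks to reset the history */
};

/* perf[cpu][class]: unit-less score in 0..255; 0 means "avoid this CPU". */
struct rank_table {
	uint8_t perf[NR_CPUS][NR_CLASSES];
};

/* Highest-scoring CPU for class_id, or -1 if every CPU has score 0. */
static int ideal_cpu(const struct rank_table *tbl, unsigned int class_id)
{
	int best = -1;
	int cpu;

	assert(class_id < NR_CLASSES);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		uint8_t score = tbl->perf[cpu][class_id];

		if (score == 0)
			continue;	/* hardware recommends not scheduling here */
		if (best < 0 || score > tbl->perf[best][class_id])
			best = cpu;	/* ties keep the lower CPU index */
	}
	return best;
}

/*
 * Steps 3 to 5: consume the classification, clear the history, then report
 * the CPU the thread should be considered for (may equal cur_cpu).
 */
static int on_switch_out(struct fake_cpu *cpus, const struct rank_table *tbl,
			 int cur_cpu)
{
	unsigned int class_id = cpus[cur_cpu].class_msr;	/* step 3 */
	int target;

	cpus[cur_cpu].history_cleared = 1;			/* step 4 */
	target = ideal_cpu(tbl, class_id);			/* step 5 */
	return target < 0 ? cur_cpu : target;
}
```

A real scheduler would also weigh processor availability and the efficiency
ranking before migrating anything; the sketch only shows the order of the
steps.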
On 2/18/25 19:08, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> The AMD Heterogeneous core design and Hardware Feedback Interface (HFI)
> provide behavioral classification and a dynamically updated ranking table
> for the scheduler to use when choosing cores for tasks.
>
> Threads are classified during runtime into enumerated classes.
> Currently, the driver supports 3 classes (0 through 2). These classes
> represent thread performance/power characteristics that may benefit from
> special scheduling behaviors. The real-time thread classification is
> consumed by the operating system and is used to inform the scheduler of
> where the thread should be placed for optimal performance or energy efficiency.
>
> The thread classification helps to select CPU from a ranking table that describes
> an efficiency and performance ranking for each classification from two dimensions.

Where is that happening in the series? (Using the per-thread
classification for task placement.) Am I missing something?
On Thu, 20 Mar 2025, Mario Limonciello wrote:
> On 3/19/2025 09:01, Ilpo Järvinen wrote:
> > My apologies for taking so long to get to review this series.
>
> No problem. Thanks for looking. I'll get a new version ready to put out
> after the next merge window.
>
> > Most of my
> > comments are quite minor but there's also 1-2 things that seem more
> > important. It seemed to me that there is some disconnection between the
> > promises made in the Kconfig description and what is provided by the patch
> > series.
>
> Some of the series was pared down to go in multiple parts to make it easier
> to review, with follow-ups for the dynamic stuff planned for the next
> iteration.
>
> You see some artifacts of that in the comments and Kconfig. I figured it was
> better to leave those as is given they get to the intent, but I can change
> them if you think it's better to adjust them when the next part lands
> instead.

Okay, I thought that might be because of such a split into multiple series.
I think you can leave those as is, as I assume the intention is to
immediately follow up with the other parts (and not, say, wait a few kernel
releases or so)?
On 3/21/2025 08:58, Ilpo Järvinen wrote:
> On Thu, 20 Mar 2025, Mario Limonciello wrote:
>
>> Some of the series was pared down to go in multiple parts to make it easier
>> to review, with follow-ups for the dynamic stuff planned for the next
>> iteration.
>>
>> You see some artifacts of that in the comments and Kconfig. I figured it was
>> better to leave those as is given they get to the intent, but I can change
>> them if you think it's better to adjust them when the next part lands
>> instead.
>
> Okay, I thought that might be because of such a split into multiple series.
> I think you can leave those as is, as I assume the intention is to
> immediately follow up with the other parts (and not, say, wait a few kernel
> releases or so)?
>

The next part was going to be submitted by another team. Let me check
offline with them on their intended timing and I will make a call on what
to do.
From: Mario Limonciello <mario.limonciello@amd.com>

The AMD Heterogeneous core design and Hardware Feedback Interface (HFI)
provide behavioral classification and a dynamically updated ranking table
for the scheduler to use when choosing cores for tasks.

Threads are classified during runtime into enumerated classes.
Currently, the driver supports 3 classes (0 through 2). These classes
represent thread performance/power characteristics that may benefit from
special scheduling behaviors. The real-time thread classification is
consumed by the operating system and is used to inform the scheduler of
where the thread should be placed for optimal performance or energy
efficiency.

The thread classification helps to select a CPU from a ranking table that
describes, along two dimensions, an efficiency ranking and a performance
ranking for each classification.

The ranking data provided by the ranking table are numbers ranging from 0
to 255, where a higher performance value indicates higher performance
capability and a higher efficiency value indicates greater efficiency. All
the CPU cores are ranked for each class ID, and within a class ranking the
cores may have different ranking values. Picking from the ranking for a
given classification ID will therefore later allow the scheduler to select
the best core for threads classified into that workload class.

This series was originally submitted by Perry Yuan [1], but he has since
moved to a different role and asked me to take over.

Link: https://lore.kernel.org/all/cover.1724748733.git.perry.yuan@amd.com/ [1]

On applicable hardware, this series yields an improvement of between 2% and
5% across various benchmarks.

There is, however, a cost associated with clearing history on the process
context switch. On average it increases the delay by 119ns, and it also
widens the range of delays (the standard deviation is 25% greater).
Mario Limonciello (5):
  MAINTAINERS: Add maintainer entry for AMD Hardware Feedback Driver
  cpufreq/amd-pstate: Disable preferred cores on designs with workload
    classification
  platform/x86/amd: hfi: Set ITMT priority from ranking data
  platform/x86/amd: hfi: Add debugfs support
  x86/itmt: Add debugfs file to show core priorities

Perry Yuan (8):
  Documentation: x86: Add AMD Hardware Feedback Interface documentation
  x86/msr-index: define AMD heterogeneous CPU related MSR
  platform/x86: hfi: Introduce AMD Hardware Feedback Interface Driver
  platform/x86: hfi: parse CPU core ranking data from shared memory
  platform/x86: hfi: init per-cpu scores for each class
  platform/x86: hfi: add online and offline callback support
  platform/x86: hfi: add power management callback
  x86/process: Clear hardware feedback history for AMD processors

 Documentation/arch/x86/amd-hfi.rst    | 127 ++++++
 Documentation/arch/x86/index.rst      |   1 +
 MAINTAINERS                           |   9 +
 arch/x86/include/asm/msr-index.h      |   5 +
 arch/x86/kernel/itmt.c                |  23 ++
 arch/x86/kernel/process_64.c          |   4 +
 drivers/cpufreq/amd-pstate.c          |   6 +
 drivers/platform/x86/amd/Kconfig      |   1 +
 drivers/platform/x86/amd/Makefile     |   1 +
 drivers/platform/x86/amd/hfi/Kconfig  |  21 +
 drivers/platform/x86/amd/hfi/Makefile |   7 +
 drivers/platform/x86/amd/hfi/hfi.c    | 550 ++++++++++++++++++++++++++
 12 files changed, 755 insertions(+)
 create mode 100644 Documentation/arch/x86/amd-hfi.rst
 create mode 100644 drivers/platform/x86/amd/hfi/Kconfig
 create mode 100644 drivers/platform/x86/amd/hfi/Makefile
 create mode 100644 drivers/platform/x86/amd/hfi/hfi.c
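
The two-dimensional ranking described in the cover letter (one performance
score and one efficiency score per core and per class, each in 0..255) can be
sketched as a small lookup. This is an illustrative sketch, not the layout or
API used by the series: the struct `cpu_rank`, the function `pick_cpu`, and
the `prefer_eff` flag are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from the amd_hfi driver. */
#define NR_CLASSES 3	/* the cover letter mentions classes 0 through 2 */

struct cpu_rank {
	uint8_t perf[NR_CLASSES];	/* 0..255, higher = more capable */
	uint8_t eff[NR_CLASSES];	/* 0..255, higher = more efficient */
};

/*
 * Return the index of the core with the highest score for class_id in the
 * chosen dimension, or -1 if the table is empty. Ties keep the lower index.
 */
static int pick_cpu(const struct cpu_rank *tbl, size_t n,
		    unsigned int class_id, int prefer_eff)
{
	unsigned int best_score = 0;
	int best = -1;
	size_t i;

	assert(class_id < NR_CLASSES);
	for (i = 0; i < n; i++) {
		unsigned int score = prefer_eff ? tbl[i].eff[class_id]
						: tbl[i].perf[class_id];

		if (best < 0 || score > best_score) {
			best = (int)i;
			best_score = score;
		}
	}
	return best;
}
```

Which dimension to prefer for a given class is a policy decision; the example
table in the documentation suggests performance for class 0 and efficiency
for classes 1 and 2.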