mbox series

[v8,00/13] Support PPTT for ARM64

Message ID 20180425233121.13270-1-jeremy.linton@arm.com
Headers show
Series Support PPTT for ARM64 | expand

Message

Jeremy Linton April 25, 2018, 11:31 p.m. UTC
ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to describe the processor and cache topology. Ideally it is
used to extend/override information provided by the hardware, but
right now ARM64 is entirely dependent on firmware provided tables.

This patch parses the table for the cache topology and CPU topology.
When we enable ACPI/PPTT for arm64 we map the package_id to the
PPTT node flagged as the physical package by the firmware.
This results in topologies that match what the remainder of the
system expects. Finally, we update the scheduler MC domain so that
it generally reflects the LLC unless the LLC is too large for the
NUMA domain (or package).

For example on juno:
[root@mammon-juno-rh topology]# lstopo-no-graphics
  Package L#0
    L2 L#0 (1024KB)
      L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
      L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#1 (2048KB)
      L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
      L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
  HostBridge L#0
    PCIBridge
      PCIBridge
        PCIBridge
          PCI 1095:3132
            Block(Disk) L#0 "sda"
        PCIBridge
          PCI 1002:68f9
            GPU L#1 "renderD128"
            GPU L#2 "card0"
            GPU L#3 "controlD64"
        PCIBridge
          PCI 11ab:4380
            Net L#4 "enp8s0"

Git tree at:
http://linux-arm.org/git?p=linux-jlinton.git
branch: pptt_v8

v7->v8:
Modify the logic used to select the MC domain (the change
  shouldn't modify the sched domains on any existing machines
  compared to v7, only how they are built)
Reduce the severity of some parsing messages.
Fix s390 link problem.
Further checks to deal with broken PPTT tables.
Various style tweaks, SPDX license addition, etc.

v6->v7:
Add additional patch to use the last cache level within the NUMA
  or socket as the MC domain. This assures the MC domain is
  equal or smaller than the DIE.
  
Various formatting/etc review comments.

Rebase to 4.16rc2

v5->v6:
Add additional patches which re-factor how the initial DT code sets
  up the cacheinfo structure so that its not as dependent on the
  of_node stored in that tree. Once that is done we rename it
  for use with the ACPI code.

Additionally there were a fair number of minor name/location/etc
  tweaks scattered about made in response to review comments.

v4->v5:
Update the cache type from NOCACHE to UNIFIED when all the cache
  attributes we update are valid. This fixes a problem where caches
  which are entirely created by the PPTT don't show up in lstopo.

Give the PPTT its own firmware_node in the cache structure instead of
  sharing it with the of_node.

Move some pieces around between patches.

(see previous cover letters for futher changes)

Jeremy Linton (13):
  drivers: base: cacheinfo: move cache_setup_of_node()
  drivers: base: cacheinfo: setup DT cache properties early
  cacheinfo: rename of_node to fw_token
  arm64/acpi: Create arch specific cpu to acpi id helper
  ACPI/PPTT: Add Processor Properties Topology Table parsing
  ACPI: Enable PPTT support on ARM64
  drivers: base cacheinfo: Add support for ACPI based firmware tables
  arm64: Add support for ACPI based firmware tables
  ACPI/PPTT: Add topology parsing code
  arm64: topology: rename cluster_id
  arm64: topology: enable ACPI/PPTT based CPU topology
  ACPI: Add PPTT to injectable table list
  arm64: topology: divorce MC scheduling domain from core_siblings

 arch/arm64/Kconfig                |   1 +
 arch/arm64/include/asm/acpi.h     |   4 +
 arch/arm64/include/asm/topology.h |   6 +-
 arch/arm64/kernel/cacheinfo.c     |  15 +-
 arch/arm64/kernel/topology.c      | 103 +++++-
 arch/riscv/kernel/cacheinfo.c     |   1 -
 drivers/acpi/Kconfig              |   3 +
 drivers/acpi/Makefile             |   1 +
 drivers/acpi/pptt.c               | 678 ++++++++++++++++++++++++++++++++++++++
 drivers/acpi/tables.c             |   2 +-
 drivers/base/cacheinfo.c          | 157 ++++-----
 include/linux/acpi.h              |   4 +
 include/linux/cacheinfo.h         |  18 +-
 13 files changed, 886 insertions(+), 107 deletions(-)
 create mode 100644 drivers/acpi/pptt.c

-- 
2.13.6

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ard Biesheuvel April 26, 2018, 7:57 a.m. UTC | #1
On 26 April 2018 at 01:31, Jeremy Linton <jeremy.linton@arm.com> wrote:
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is

> used to describe the processor and cache topology. Ideally it is

> used to extend/override information provided by the hardware, but

> right now ARM64 is entirely dependent on firmware provided tables.

>

> This patch parses the table for the cache topology and CPU topology.

> When we enable ACPI/PPTT for arm64 we map the package_id to the

> PPTT node flagged as the physical package by the firmware.

> This results in topologies that match what the remainder of the

> system expects. Finally, we update the scheduler MC domain so that

> it generally reflects the LLC unless the LLC is too large for the

> NUMA domain (or package).

>


Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> # SynQuacer


Machine (7974MB)
  Package L#0 + L3 L#0 (4096KB)
    L2 L#0 (256KB)
      L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L2 L#1 (256KB)
      L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#2 (256KB)
      L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
      L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
    L2 L#3 (256KB)
      L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
      L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
    L2 L#4 (256KB)
      L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
      L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    L2 L#5 (256KB)
      L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
      L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
    L2 L#6 (256KB)
      L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#12)
      L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#13)
    L2 L#7 (256KB)
      L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#14)
      L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#15)
    L2 L#8 (256KB)
      L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#16)
      L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#17)
    L2 L#9 (256KB)
      L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#18)
      L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#19)
    L2 L#10 (256KB)
      L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 (P#20)
      L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 (P#21)
    L2 L#11 (256KB)
      L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 (P#22)
      L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 (P#23)
  HostBridge L#0
    PCIBridge
      PCIBridge
        PCI 1b21:0612
          Block(Removable Media Device) L#0 "sr0"
          Block(Disk) L#1 "sda"
      PCIBridge
        PCI 168c:0032
          Net L#2 "wlp3s0"
  HostBridge L#4
    PCI 10de:128b
      GPU L#3 "renderD128"
      GPU L#4 "card0"
      GPU L#5 "controlD64"


> For example on juno:

> [root@mammon-juno-rh topology]# lstopo-no-graphics

>   Package L#0

>     L2 L#0 (1024KB)

>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)

>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)

>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)

>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)

>     L2 L#1 (2048KB)

>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)

>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)

>   HostBridge L#0

>     PCIBridge

>       PCIBridge

>         PCIBridge

>           PCI 1095:3132

>             Block(Disk) L#0 "sda"

>         PCIBridge

>           PCI 1002:68f9

>             GPU L#1 "renderD128"

>             GPU L#2 "card0"

>             GPU L#3 "controlD64"

>         PCIBridge

>           PCI 11ab:4380

>             Net L#4 "enp8s0"

>

> Git tree at:

> http://linux-arm.org/git?p=linux-jlinton.git

> branch: pptt_v8

>

> v7->v8:

> Modify the logic used to select the MC domain (the change

>   shouldn't modify the sched domains on any existing machines

>   compared to v7, only how they are built)

> Reduce the severity of some parsing messages.

> Fix s390 link problem.

> Further checks to deal with broken PPTT tables.

> Various style tweaks, SPDX license addition, etc.

>

> v6->v7:

> Add additional patch to use the last cache level within the NUMA

>   or socket as the MC domain. This assures the MC domain is

>   equal or smaller than the DIE.

>

> Various formatting/etc review comments.

>

> Rebase to 4.16rc2

>

> v5->v6:

> Add additional patches which re-factor how the initial DT code sets

>   up the cacheinfo structure so that its not as dependent on the

>   of_node stored in that tree. Once that is done we rename it

>   for use with the ACPI code.

>

> Additionally there were a fair number of minor name/location/etc

>   tweaks scattered about made in response to review comments.

>

> v4->v5:

> Update the cache type from NOCACHE to UNIFIED when all the cache

>   attributes we update are valid. This fixes a problem where caches

>   which are entirely created by the PPTT don't show up in lstopo.

>

> Give the PPTT its own firmware_node in the cache structure instead of

>   sharing it with the of_node.

>

> Move some pieces around between patches.

>

> (see previous cover letters for futher changes)

>

> Jeremy Linton (13):

>   drivers: base: cacheinfo: move cache_setup_of_node()

>   drivers: base: cacheinfo: setup DT cache properties early

>   cacheinfo: rename of_node to fw_token

>   arm64/acpi: Create arch specific cpu to acpi id helper

>   ACPI/PPTT: Add Processor Properties Topology Table parsing

>   ACPI: Enable PPTT support on ARM64

>   drivers: base cacheinfo: Add support for ACPI based firmware tables

>   arm64: Add support for ACPI based firmware tables

>   ACPI/PPTT: Add topology parsing code

>   arm64: topology: rename cluster_id

>   arm64: topology: enable ACPI/PPTT based CPU topology

>   ACPI: Add PPTT to injectable table list

>   arm64: topology: divorce MC scheduling domain from core_siblings

>

>  arch/arm64/Kconfig                |   1 +

>  arch/arm64/include/asm/acpi.h     |   4 +

>  arch/arm64/include/asm/topology.h |   6 +-

>  arch/arm64/kernel/cacheinfo.c     |  15 +-

>  arch/arm64/kernel/topology.c      | 103 +++++-

>  arch/riscv/kernel/cacheinfo.c     |   1 -

>  drivers/acpi/Kconfig              |   3 +

>  drivers/acpi/Makefile             |   1 +

>  drivers/acpi/pptt.c               | 678 ++++++++++++++++++++++++++++++++++++++

>  drivers/acpi/tables.c             |   2 +-

>  drivers/base/cacheinfo.c          | 157 ++++-----

>  include/linux/acpi.h              |   4 +

>  include/linux/cacheinfo.h         |  18 +-

>  13 files changed, 886 insertions(+), 107 deletions(-)

>  create mode 100644 drivers/acpi/pptt.c

>

> --

> 2.13.6

>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla April 26, 2018, 10:27 a.m. UTC | #2
On 26/04/18 00:31, Jeremy Linton wrote:
> Its helpful to be able to lookup the acpi_processor_id associated

> with a logical cpu. Provide an arm64 helper to do this.

> 


As I pointed out in the earlier version, this patch is not required.
The acpi_id stored in the acpi_processor can be used for this.
Won't the below change make it work ? I can't think of any reason why it
shouldn't.

Regards,
Sudeep

-->8

diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
index 0fc4b2654665..f421f58b4ae6 100644
--- i/drivers/acpi/pptt.c
+++ w/drivers/acpi/pptt.c
@@ -432,7 +432,7 @@ static void cache_setup_acpi_cpu(struct
acpi_table_header *table,
 {
        struct acpi_pptt_cache *found_cache;
        struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-       u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+       u32 acpi_cpu_id = per_cpu(processors, cpu)->acpi_id;
        struct cacheinfo *this_leaf;
        unsigned int index = 0;
        struct acpi_pptt_processor *cpu_node = NULL;
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla April 26, 2018, 11:05 a.m. UTC | #3
On 26/04/18 00:31, Jeremy Linton wrote:
> Call ACPI cache parsing routines from base cacheinfo code if ACPI

> is enable. Also stub out cache_setup_acpi() so that individual

> architectures can enable ACPI topology parsing.

> 


[...]

> +#ifndef CONFIG_ACPI

> +static inline int acpi_find_last_cache_level(unsigned int cpu)

> +{

> +	/* ACPI kernels should be built with PPTT support */


This sounds incorrect for x86. But I understand why you have it there.
Does it makes sense to change above to .. ?

#if !defined(CONFIG_ACPI) || (defined(CONFIG_ACPI) && !(CONFIG_ACPI_PPTT))

-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeremy Linton April 26, 2018, 6:57 p.m. UTC | #4
Hi,

On 04/26/2018 06:05 AM, Sudeep Holla wrote:
> 

> 

> On 26/04/18 00:31, Jeremy Linton wrote:

>> Call ACPI cache parsing routines from base cacheinfo code if ACPI

>> is enable. Also stub out cache_setup_acpi() so that individual

>> architectures can enable ACPI topology parsing.

>>

> 

> [...]

> 

>> +#ifndef CONFIG_ACPI

>> +static inline int acpi_find_last_cache_level(unsigned int cpu)

>> +{

>> +	/* ACPI kernels should be built with PPTT support */

> 

> This sounds incorrect for x86. But I understand why you have it there.

> Does it makes sense to change above to .. ?

> 

> #if !defined(CONFIG_ACPI) || (defined(CONFIG_ACPI) && !(CONFIG_ACPI_PPTT))

> 

I'm not sure what that buys us, if anything you want more non-users of 
the function to be falling through to the function prototype rather than 
the static inline. The only place any of this matters (as long as the 
compiler/linker is tossing the static inline) is arm64 because its the 
only arch making a call to acpi_find_last_cache_level(). ACPI_PPTT is 
also only visible on arm64 at the moment due to being wrapped in a if 
ARM64 in the Kconfig

Put another way, I wouldn't expect an arch to have a 'user' visible 
option to enable/disable parsing the PPTT. If an arch can handle 
ACPI/PPTT topology then I would expect it to be fixed to the CONFIG_ACPI 
state. What happens when acpi_find_last_cache_level() is called should 
only be dependent on whether ACPI is enabled, the PPTT parser itself 
will handle the cases of a missing table.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla April 27, 2018, 12:49 p.m. UTC | #5
On 26/04/18 19:57, Jeremy Linton wrote:
> Hi,

> 

> On 04/26/2018 06:05 AM, Sudeep Holla wrote:

>>

>>

>> On 26/04/18 00:31, Jeremy Linton wrote:

>>> Call ACPI cache parsing routines from base cacheinfo code if ACPI

>>> is enable. Also stub out cache_setup_acpi() so that individual

>>> architectures can enable ACPI topology parsing.

>>>

>>

>> [...]

>>

>>> +#ifndef CONFIG_ACPI

>>> +static inline int acpi_find_last_cache_level(unsigned int cpu)

>>> +{

>>> +    /* ACPI kernels should be built with PPTT support */

>>

>> This sounds incorrect for x86. But I understand why you have it there.

>> Does it makes sense to change above to .. ?

>>

>> #if !defined(CONFIG_ACPI) || (defined(CONFIG_ACPI) &&

>> !(CONFIG_ACPI_PPTT))

>>

> I'm not sure what that buys us, if anything you want more non-users of

> the function to be falling through to the function prototype rather than

> the static inline. The only place any of this matters (as long as the

> compiler/linker is tossing the static inline) is arm64 because its the

> only arch making a call to acpi_find_last_cache_level(). ACPI_PPTT is

> also only visible on arm64 at the moment due to being wrapped in a if

> ARM64 in the Kconfig

> 


Fair enough.

> Put another way, I wouldn't expect an arch to have a 'user' visible

> option to enable/disable parsing the PPTT. If an arch can handle

> ACPI/PPTT topology then I would expect it to be fixed to the CONFIG_ACPI

> state. What happens when acpi_find_last_cache_level() is called should

> only be dependent on whether ACPI is enabled, the PPTT parser itself

> will handle the cases of a missing table.


Agreed. But technically that statement is still incorrect as x86 ACPI
build need not have PPTT enabled. IMO you can reword it, but I will
leave that to Rafael :)

Other than that, it looks good.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>


-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla April 27, 2018, 1:08 p.m. UTC | #6
On 26/04/18 19:33, Jeremy Linton wrote:
> Hi,

> 

> On 04/26/2018 05:27 AM, Sudeep Holla wrote:

>>

>>

>> On 26/04/18 00:31, Jeremy Linton wrote:

>>> Its helpful to be able to lookup the acpi_processor_id associated

>>> with a logical cpu. Provide an arm64 helper to do this.

>>>

>>

>> As I pointed out in the earlier version, this patch is not required.

>> The acpi_id stored in the acpi_processor can be used for this.

>> Won't the below change make it work ? I can't think of any reason why it

>> shouldn't.

> 

> So, I only noticed your previous email last night on the mail archive,

> as I was applying your review/ack tags and couldn't find a response for

> this patch, seem the spam/etc filters need some further tweaking!

>


Ah that's bad.

> At that point, I was pretty sure the suggestion wasn't going to work out

> of the box as a lot of this code is running fairly early in the boot

> process. I spent a bit of time and plugged the change in to verify that

> assertion, and yes the per_cpu processor/acpi bits aren't setup early

> enough to be used by much of this code. It is being called from

> init_cpu_topology()/smp_prepare_cpus() which precedes

> do_basic_setup/do_initcalls() which is what runs the acpi_init()

> sequence which ends up eventually allocating the required data

> structures. So without restructuring the core boot sequence, this seems

> like a reasonable solution.

> 


OK makes sense. I completely ignored topology related patches as I
haven't looked at them yet and assumed cacheinfo is the only user. Sorry
for that.

-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeremy Linton April 27, 2018, 4:20 p.m. UTC | #7
Hi,

Thanks for taking a look at this.

On 04/27/2018 06:02 AM, Rafael J. Wysocki wrote:
> On Thu, Apr 26, 2018 at 1:31 AM, Jeremy Linton <jeremy.linton@arm.com> wrote:

>> ACPI 6.2 adds a new table, which describes how processing units

>> are related to each other in tree like fashion. Caches are

>> also sprinkled throughout the tree and describe the properties

>> of the caches in relation to other caches and processing units.

>>

>> Add the code to parse the cache hierarchy and report the total

>> number of levels of cache for a given core using

>> acpi_find_last_cache_level() as well as fill out the individual

>> cores cache information with cache_setup_acpi() once the

>> cpu_cacheinfo structure has been populated by the arch specific

>> code.

>>

>> An additional patch later in the set adds the ability to report

>> peers in the topology using find_acpi_cpu_topology()

>> to report a unique ID for each processing unit at a given level

>> in the tree. These unique id's can then be used to match related

>> processing units which exist as threads, within a given

>> package, etc.

>>

>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

>> Acked-by: Sudeep Holla <sudeep.holla@arm.com>

>> ---

>>   drivers/acpi/pptt.c | 518 ++++++++++++++++++++++++++++++++++++++++++++++++++++

>>   1 file changed, 518 insertions(+)

>>   create mode 100644 drivers/acpi/pptt.c

>>

>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c

>> new file mode 100644

>> index 000000000000..cced71ef851a

>> --- /dev/null

>> +++ b/drivers/acpi/pptt.c

>> @@ -0,0 +1,518 @@

>> +// SPDX-License-Identifier: GPL-2.0

>> +/*

>> + * pptt.c - parsing of Processor Properties Topology Table

>> + *

>> + * Copyright (C) 2018, ARM

>> + *

>> + * This file implements parsing of Processor Properties Topology Table (PPTT)

>> + * which is optionally used to describe the processor and cache topology.

>> + * Due to the relative pointers used throughout the table, this doesn't

>> + * leverage the existing subtable parsing in the kernel.

>> + *

>> + * The PPTT structure is an inverted tree, with each node potentially

>> + * holding one or two inverted tree data structures describing

>> + * the caches available at that level. Each cache structure optionally

>> + * contains properties describing the cache at a given level which can be

>> + * used to override hardware probed values.

>> + */

>> +#define pr_fmt(fmt) "ACPI PPTT: " fmt

>> +

>> +#include <linux/acpi.h>

>> +#include <linux/cacheinfo.h>

>> +#include <acpi/processor.h>

>> +

>> +/**

>> + * fetch_pptt_subtable() - Find/Verify that the PPTT ref is a valid subtable

> 

> The parens above are at least redundant (if not harmful).  Everywhere

> else in a similar context too.


The kerneldoc ones? I guess i'm confused the kernel doc example in 
doc-guide/kernel-doc has

* function_name() - Brief description of function.


> 

> Also kerneldoc comments document function arguments too as a rule, so

> please do that here and wherever you use kerneldoc comments in the

> patchset.


Ok, sure.

> 

>> + *

>> + * Given the PPTT table, find and verify that the subtable entry

>> + * is located within the table

>> + *

>> + * Return: acpi_subtable_header* or NULL

>> + */

>> +static struct acpi_subtable_header *fetch_pptt_subtable(struct acpi_table_header *table_hdr,

>> +                                                       u32 pptt_ref)

>> +{

>> +       struct acpi_subtable_header *entry;

>> +

>> +       /* there isn't a subtable at reference 0 */

>> +       if (pptt_ref < sizeof(struct acpi_subtable_header))

>> +               return NULL;

>> +

>> +       if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)

>> +               return NULL;

>> +

>> +       entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);

>> +

>> +       if (entry->length == 0)

>> +               return NULL;

>> +

>> +       if (pptt_ref + entry->length > table_hdr->length)

>> +               return NULL;

>> +

>> +       return entry;

>> +}

> 

> Apart from the above I'm not entirely sure why you need the changes in

> patch [09/13] to go in a separate patch.  All of them are new code

> going into the file created by this patch, so why not to put them

> here?



Ok, I was doing that because Lorenzo asked for it, but he hasn't said 
much so I will collapse it back together. That makes me happy, as 
splitting chunks between patches is a pain anyway.

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki April 30, 2018, 7:59 a.m. UTC | #8
On Friday, April 27, 2018 6:20:44 PM CEST Jeremy Linton wrote:
> Hi,

> 

> Thanks for taking a look at this.

> 

> On 04/27/2018 06:02 AM, Rafael J. Wysocki wrote:

> > On Thu, Apr 26, 2018 at 1:31 AM, Jeremy Linton <jeremy.linton@arm.com> wrote:

> >> ACPI 6.2 adds a new table, which describes how processing units

> >> are related to each other in tree like fashion. Caches are

> >> also sprinkled throughout the tree and describe the properties

> >> of the caches in relation to other caches and processing units.

> >>

> >> Add the code to parse the cache hierarchy and report the total

> >> number of levels of cache for a given core using

> >> acpi_find_last_cache_level() as well as fill out the individual

> >> cores cache information with cache_setup_acpi() once the

> >> cpu_cacheinfo structure has been populated by the arch specific

> >> code.

> >>

> >> An additional patch later in the set adds the ability to report

> >> peers in the topology using find_acpi_cpu_topology()

> >> to report a unique ID for each processing unit at a given level

> >> in the tree. These unique id's can then be used to match related

> >> processing units which exist as threads, within a given

> >> package, etc.

> >>

> >> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

> >> Acked-by: Sudeep Holla <sudeep.holla@arm.com>

> >> ---

> >>   drivers/acpi/pptt.c | 518 ++++++++++++++++++++++++++++++++++++++++++++++++++++

> >>   1 file changed, 518 insertions(+)

> >>   create mode 100644 drivers/acpi/pptt.c

> >>

> >> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c

> >> new file mode 100644

> >> index 000000000000..cced71ef851a

> >> --- /dev/null

> >> +++ b/drivers/acpi/pptt.c

> >> @@ -0,0 +1,518 @@

> >> +// SPDX-License-Identifier: GPL-2.0

> >> +/*

> >> + * pptt.c - parsing of Processor Properties Topology Table

> >> + *

> >> + * Copyright (C) 2018, ARM

> >> + *

> >> + * This file implements parsing of Processor Properties Topology Table (PPTT)

> >> + * which is optionally used to describe the processor and cache topology.

> >> + * Due to the relative pointers used throughout the table, this doesn't

> >> + * leverage the existing subtable parsing in the kernel.

> >> + *

> >> + * The PPTT structure is an inverted tree, with each node potentially

> >> + * holding one or two inverted tree data structures describing

> >> + * the caches available at that level. Each cache structure optionally

> >> + * contains properties describing the cache at a given level which can be

> >> + * used to override hardware probed values.

> >> + */

> >> +#define pr_fmt(fmt) "ACPI PPTT: " fmt

> >> +

> >> +#include <linux/acpi.h>

> >> +#include <linux/cacheinfo.h>

> >> +#include <acpi/processor.h>

> >> +

> >> +/**

> >> + * fetch_pptt_subtable() - Find/Verify that the PPTT ref is a valid subtable

> > 

> > The parens above are at least redundant (if not harmful).  Everywhere

> > else in a similar context too.

> 

> The kerneldoc ones? I guess i'm confused the kernel doc example in 

> doc-guide/kernel-doc has

> 

> * function_name() - Brief description of function.


Well, OK, the parens not harmful, then, but it works without them.

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla May 1, 2018, 2:33 p.m. UTC | #9
On 26/04/18 00:31, Jeremy Linton wrote:
> Now that we have an accurate view of the physical topology

> we need to represent it correctly to the scheduler. Generally MC

> should equal the LLC in the system, but there are a number of

> special cases that need to be dealt with.

> 

> In the case of NUMA in socket, we need to assure that the sched

> domain we build for the MC layer isn't larger than the DIE above it.

> Similarly for LLC's that might exist in cross socket interconnect or

> directory hardware we need to assure that MC is shrunk to the socket

> or NUMA node.

> 

> This patch builds a sibling mask for the LLC, and then picks the

> smallest of LLC, socket siblings, or NUMA node siblings, which

> gives us the behavior described above. This is ever so slightly

> different than the similar alternative where we look for a cache

> layer less than or equal to the socket/NUMA siblings.

> 

> The logic to pick the MC layer affects all arm64 machines, but

> only changes the behavior for DT/MPIDR systems if the NUMA domain

> is smaller than the core siblings (generally set to the cluster).

> Potentially this fixes a possible bug in DT systems, but really

> it only affects ACPI systems where the core siblings is correctly

> set to the socket siblings. Thus all currently available ACPI

> systems should have MC equal to LLC, including the NUMA in socket

> machines where the LLC is partitioned between the NUMA nodes.

> 

> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

> ---

>  arch/arm64/include/asm/topology.h |  2 ++

>  arch/arm64/kernel/topology.c      | 32 +++++++++++++++++++++++++++++++-

>  2 files changed, 33 insertions(+), 1 deletion(-)

> 

> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h

> index 6b10459e6905..df48212f767b 100644

> --- a/arch/arm64/include/asm/topology.h

> +++ b/arch/arm64/include/asm/topology.h

> @@ -8,8 +8,10 @@ struct cpu_topology {

>  	int thread_id;

>  	int core_id;

>  	int package_id;

> +	int llc_id;

>  	cpumask_t thread_sibling;

>  	cpumask_t core_sibling;

> +	cpumask_t llc_siblings;

>  };

>  

>  extern struct cpu_topology cpu_topology[NR_CPUS];

> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c

> index bd1aae438a31..20b4341dc527 100644

> --- a/arch/arm64/kernel/topology.c

> +++ b/arch/arm64/kernel/topology.c

> @@ -13,6 +13,7 @@

>  

>  #include <linux/acpi.h>

>  #include <linux/arch_topology.h>

> +#include <linux/cacheinfo.h>

>  #include <linux/cpu.h>

>  #include <linux/cpumask.h>

>  #include <linux/init.h>

> @@ -214,7 +215,19 @@ EXPORT_SYMBOL_GPL(cpu_topology);

>  

>  const struct cpumask *cpu_coregroup_mask(int cpu)

>  {

> -	return &cpu_topology[cpu].core_sibling;

> +	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

> +

> +	/* Find the smaller of NUMA, core or LLC siblings */

> +	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {

> +		/* not numa in package, lets use the package siblings */

> +		core_mask = &cpu_topology[cpu].core_sibling;

> +	}

> +	if (cpu_topology[cpu].llc_id != -1) {

> +		if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))

> +			core_mask = &cpu_topology[cpu].llc_siblings;

> +	}

> +

> +	return core_mask;

>  }

>  

>  static void update_siblings_masks(unsigned int cpuid)

> @@ -226,6 +239,9 @@ static void update_siblings_masks(unsigned int cpuid)

>  	for_each_possible_cpu(cpu) {

>  		cpu_topo = &cpu_topology[cpu];

>  

> +		if (cpuid_topo->llc_id == cpu_topo->llc_id)

> +			cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);

> +


Would this not result in cpuid_topo->llc_siblings = cpu_possible_mask
on DT systems where llc_id is not set/defaults to -1 and still pass the
condition. Does it make sense to add additional -1 check ?

>  		if (cpuid_topo->package_id != cpu_topo->package_id)

>  			continue;

>  

> @@ -291,6 +307,10 @@ static void __init reset_cpu_topology(void)

>  		cpu_topo->core_id = 0;

>  		cpu_topo->package_id = -1;

>  

> +		cpu_topo->llc_id = -1;

> +		cpumask_clear(&cpu_topo->llc_siblings);

> +		cpumask_set_cpu(cpu, &cpu_topo->llc_siblings);

> +

>  		cpumask_clear(&cpu_topo->core_sibling);

>  		cpumask_set_cpu(cpu, &cpu_topo->core_sibling);

>  		cpumask_clear(&cpu_topo->thread_sibling);

> @@ -311,6 +331,8 @@ static int __init parse_acpi_topology(void)

>  	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;

>  

>  	for_each_possible_cpu(cpu) {

> +		int i;

> +

>  		topology_id = find_acpi_cpu_topology(cpu, 0);

>  		if (topology_id < 0)

>  			return topology_id;

> @@ -325,6 +347,14 @@ static int __init parse_acpi_topology(void)

>  		}

>  		topology_id = find_acpi_cpu_topology_package(cpu);

>  		cpu_topology[cpu].package_id = topology_id;

> +

> +		i = acpi_find_last_cache_level(cpu);

> +

> +		if (i > 0) {

> +			topology_id = find_acpi_cpu_cache_topology(cpu, i);

> +			if (topology_id > 0)

> +				cpu_topology[cpu].llc_id = topology_id;

> +		}


[nit] s/topology_id/cache_id/ or s/topology_id/cache_topology_id/ ?

Otherwise looks fine to me. You can add with above things fixed.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>


-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla May 1, 2018, 2:40 p.m. UTC | #10
On 26/04/18 00:31, Jeremy Linton wrote:
> Lets match the name of the arm64 topology field

> to the kernel macro that uses it.


[nit] You can add a note that cluster id is not architectural defined
for ARM platforms just to elaborate on the intention for this change.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>


-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeremy Linton May 2, 2018, 10:32 p.m. UTC | #11
Hi,

On 05/02/2018 06:49 AM, Morten Rasmussen wrote:
> On Tue, May 01, 2018 at 03:33:33PM +0100, Sudeep Holla wrote:

>>

>>

>> On 26/04/18 00:31, Jeremy Linton wrote:

>>> Now that we have an accurate view of the physical topology

>>> we need to represent it correctly to the scheduler. Generally MC

>>> should equal the LLC in the system, but there are a number of

>>> special cases that need to be dealt with.

>>>

>>> In the case of NUMA in socket, we need to assure that the sched

>>> domain we build for the MC layer isn't larger than the DIE above it.

>>> Similarly for LLC's that might exist in cross socket interconnect or

>>> directory hardware we need to assure that MC is shrunk to the socket

>>> or NUMA node.

>>>

>>> This patch builds a sibling mask for the LLC, and then picks the

>>> smallest of LLC, socket siblings, or NUMA node siblings, which

>>> gives us the behavior described above. This is ever so slightly

>>> different than the similar alternative where we look for a cache

>>> layer less than or equal to the socket/NUMA siblings.

>>>

>>> The logic to pick the MC layer affects all arm64 machines, but

>>> only changes the behavior for DT/MPIDR systems if the NUMA domain

>>> is smaller than the core siblings (generally set to the cluster).

>>> Potentially this fixes a possible bug in DT systems, but really

>>> it only affects ACPI systems where the core siblings is correctly

>>> set to the socket siblings. Thus all currently available ACPI

>>> systems should have MC equal to LLC, including the NUMA in socket

>>> machines where the LLC is partitioned between the NUMA nodes.

>>>

>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

>>> ---

>>>   arch/arm64/include/asm/topology.h |  2 ++

>>>   arch/arm64/kernel/topology.c      | 32 +++++++++++++++++++++++++++++++-

>>>   2 files changed, 33 insertions(+), 1 deletion(-)

>>>

>>> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h

>>> index 6b10459e6905..df48212f767b 100644

>>> --- a/arch/arm64/include/asm/topology.h

>>> +++ b/arch/arm64/include/asm/topology.h

>>> @@ -8,8 +8,10 @@ struct cpu_topology {

>>>   	int thread_id;

>>>   	int core_id;

>>>   	int package_id;

>>> +	int llc_id;

>>>   	cpumask_t thread_sibling;

>>>   	cpumask_t core_sibling;

>>> +	cpumask_t llc_siblings;

>>>   };

>>>   

>>>   extern struct cpu_topology cpu_topology[NR_CPUS];

>>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c

>>> index bd1aae438a31..20b4341dc527 100644

>>> --- a/arch/arm64/kernel/topology.c

>>> +++ b/arch/arm64/kernel/topology.c

>>> @@ -13,6 +13,7 @@

>>>   

>>>   #include <linux/acpi.h>

>>>   #include <linux/arch_topology.h>

>>> +#include <linux/cacheinfo.h>

>>>   #include <linux/cpu.h>

>>>   #include <linux/cpumask.h>

>>>   #include <linux/init.h>

>>> @@ -214,7 +215,19 @@ EXPORT_SYMBOL_GPL(cpu_topology);

>>>   

>>>   const struct cpumask *cpu_coregroup_mask(int cpu)

>>>   {

>>> -	return &cpu_topology[cpu].core_sibling;

>>> +	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

>>> +

>>> +	/* Find the smaller of NUMA, core or LLC siblings */

>>> +	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {

>>> +		/* not numa in package, lets use the package siblings */

>>> +		core_mask = &cpu_topology[cpu].core_sibling;

>>> +	}

>>> +	if (cpu_topology[cpu].llc_id != -1) {

>>> +		if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))

>>> +			core_mask = &cpu_topology[cpu].llc_siblings;

>>> +	}

>>> +

>>> +	return core_mask;

>>>   }

>>>   

>>>   static void update_siblings_masks(unsigned int cpuid)

>>> @@ -226,6 +239,9 @@ static void update_siblings_masks(unsigned int cpuid)

>>>   	for_each_possible_cpu(cpu) {

>>>   		cpu_topo = &cpu_topology[cpu];

>>>   

>>> +		if (cpuid_topo->llc_id == cpu_topo->llc_id)

>>> +			cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);

>>> +

>>

>> Would this not result in cpuid_topo->llc_siblings = cpu_possible_mask

>> on DT systems where llc_id is not set/defaults to -1 and still pass the

>> condition. Does it make sense to add additional -1 check ?

> 

> I don't think mask will be used by the current code if llc_id == -1 as

> the user does the check. Is it better to have the mask empty than

> default to cpu_possible_mask? If we require all users to implement a

> check it shouldn't matter.

> 


Right.

There is also the other way of thinking about it, which is if you remove 
the if llc_id == -1 check in cpu_coregroup_mask() does it make more 
sense to have llc_siblings default equal all the cores, or just the one 
being requested?
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeremy Linton May 2, 2018, 10:34 p.m. UTC | #12
Hi,

Thanks for taking a look at this.

On 05/01/2018 09:33 AM, Sudeep Holla wrote:
> 

> 

> On 26/04/18 00:31, Jeremy Linton wrote:

>> Now that we have an accurate view of the physical topology

>> we need to represent it correctly to the scheduler. Generally MC

>> should equal the LLC in the system, but there are a number of

>> special cases that need to be dealt with.

>>

>> In the case of NUMA in socket, we need to assure that the sched

>> domain we build for the MC layer isn't larger than the DIE above it.

>> Similarly for LLC's that might exist in cross socket interconnect or

>> directory hardware we need to assure that MC is shrunk to the socket

>> or NUMA node.

>>

>> This patch builds a sibling mask for the LLC, and then picks the

>> smallest of LLC, socket siblings, or NUMA node siblings, which

>> gives us the behavior described above. This is ever so slightly

>> different than the similar alternative where we look for a cache

>> layer less than or equal to the socket/NUMA siblings.

>>

>> The logic to pick the MC layer affects all arm64 machines, but

>> only changes the behavior for DT/MPIDR systems if the NUMA domain

>> is smaller than the core siblings (generally set to the cluster).

>> Potentially this fixes a possible bug in DT systems, but really

>> it only affects ACPI systems where the core siblings is correctly

>> set to the socket siblings. Thus all currently available ACPI

>> systems should have MC equal to LLC, including the NUMA in socket

>> machines where the LLC is partitioned between the NUMA nodes.

>>

>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

>> ---

>>   arch/arm64/include/asm/topology.h |  2 ++

>>   arch/arm64/kernel/topology.c      | 32 +++++++++++++++++++++++++++++++-

>>   2 files changed, 33 insertions(+), 1 deletion(-)

>>

>> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h

>> index 6b10459e6905..df48212f767b 100644

>> --- a/arch/arm64/include/asm/topology.h

>> +++ b/arch/arm64/include/asm/topology.h

>> @@ -8,8 +8,10 @@ struct cpu_topology {

>>   	int thread_id;

>>   	int core_id;

>>   	int package_id;

>> +	int llc_id;

>>   	cpumask_t thread_sibling;

>>   	cpumask_t core_sibling;

>> +	cpumask_t llc_siblings;

>>   };

>>   

>>   extern struct cpu_topology cpu_topology[NR_CPUS];

>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c

>> index bd1aae438a31..20b4341dc527 100644

>> --- a/arch/arm64/kernel/topology.c

>> +++ b/arch/arm64/kernel/topology.c

>> @@ -13,6 +13,7 @@

>>   

>>   #include <linux/acpi.h>

>>   #include <linux/arch_topology.h>

>> +#include <linux/cacheinfo.h>

>>   #include <linux/cpu.h>

>>   #include <linux/cpumask.h>

>>   #include <linux/init.h>

>> @@ -214,7 +215,19 @@ EXPORT_SYMBOL_GPL(cpu_topology);

>>   

>>   const struct cpumask *cpu_coregroup_mask(int cpu)

>>   {

>> -	return &cpu_topology[cpu].core_sibling;

>> +	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

>> +

>> +	/* Find the smaller of NUMA, core or LLC siblings */

>> +	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {

>> +		/* not numa in package, lets use the package siblings */

>> +		core_mask = &cpu_topology[cpu].core_sibling;

>> +	}

>> +	if (cpu_topology[cpu].llc_id != -1) {

>> +		if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))

>> +			core_mask = &cpu_topology[cpu].llc_siblings;

>> +	}

>> +

>> +	return core_mask;

>>   }

>>   

>>   static void update_siblings_masks(unsigned int cpuid)

>> @@ -226,6 +239,9 @@ static void update_siblings_masks(unsigned int cpuid)

>>   	for_each_possible_cpu(cpu) {

>>   		cpu_topo = &cpu_topology[cpu];

>>   

>> +		if (cpuid_topo->llc_id == cpu_topo->llc_id)

>> +			cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);

>> +

> 

> Would this not result in cpuid_topo->llc_siblings = cpu_possible_mask

> on DT systems where llc_id is not set/defaults to -1 and still pass the

> condition. Does it make sense to add additional -1 check ?

(see comment in Morton's thread)

> 

>>   		if (cpuid_topo->package_id != cpu_topo->package_id)

>>   			continue;

>>   

>> @@ -291,6 +307,10 @@ static void __init reset_cpu_topology(void)

>>   		cpu_topo->core_id = 0;

>>   		cpu_topo->package_id = -1;

>>   

>> +		cpu_topo->llc_id = -1;

>> +		cpumask_clear(&cpu_topo->llc_siblings);

>> +		cpumask_set_cpu(cpu, &cpu_topo->llc_siblings);

>> +

>>   		cpumask_clear(&cpu_topo->core_sibling);

>>   		cpumask_set_cpu(cpu, &cpu_topo->core_sibling);

>>   		cpumask_clear(&cpu_topo->thread_sibling);

>> @@ -311,6 +331,8 @@ static int __init parse_acpi_topology(void)

>>   	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;

>>   

>>   	for_each_possible_cpu(cpu) {

>> +		int i;

>> +

>>   		topology_id = find_acpi_cpu_topology(cpu, 0);

>>   		if (topology_id < 0)

>>   			return topology_id;

>> @@ -325,6 +347,14 @@ static int __init parse_acpi_topology(void)

>>   		}

>>   		topology_id = find_acpi_cpu_topology_package(cpu);

>>   		cpu_topology[cpu].package_id = topology_id;

>> +

>> +		i = acpi_find_last_cache_level(cpu);

>> +

>> +		if (i > 0) {

>> +			topology_id = find_acpi_cpu_cache_topology(cpu, i);

>> +			if (topology_id > 0)

>> +				cpu_topology[cpu].llc_id = topology_id;

>> +		}

> 

> [nit] s/topology_id/cache_id/ or s/topology_id/cache_topology_id/ ?


Sure.

> 

> Otherwise looks fine to me. You can add with above things fixed.

> 

> Acked-by: Sudeep Holla <sudeep.holla@arm.com>

> 


Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Morten Rasmussen May 3, 2018, 11:20 a.m. UTC | #13
On Wed, May 02, 2018 at 05:32:54PM -0500, Jeremy Linton wrote:
> Hi,

> 

> On 05/02/2018 06:49 AM, Morten Rasmussen wrote:

> >On Tue, May 01, 2018 at 03:33:33PM +0100, Sudeep Holla wrote:

> >>

> >>

> >>On 26/04/18 00:31, Jeremy Linton wrote:

> >>>Now that we have an accurate view of the physical topology

> >>>we need to represent it correctly to the scheduler. Generally MC

> >>>should equal the LLC in the system, but there are a number of

> >>>special cases that need to be dealt with.

> >>>

> >>>In the case of NUMA in socket, we need to assure that the sched

> >>>domain we build for the MC layer isn't larger than the DIE above it.

> >>>Similarly for LLC's that might exist in cross socket interconnect or

> >>>directory hardware we need to assure that MC is shrunk to the socket

> >>>or NUMA node.

> >>>

> >>>This patch builds a sibling mask for the LLC, and then picks the

> >>>smallest of LLC, socket siblings, or NUMA node siblings, which

> >>>gives us the behavior described above. This is ever so slightly

> >>>different than the similar alternative where we look for a cache

> >>>layer less than or equal to the socket/NUMA siblings.

> >>>

> >>>The logic to pick the MC layer affects all arm64 machines, but

> >>>only changes the behavior for DT/MPIDR systems if the NUMA domain

> >>>is smaller than the core siblings (generally set to the cluster).

> >>>Potentially this fixes a possible bug in DT systems, but really

> >>>it only affects ACPI systems where the core siblings is correctly

> >>>set to the socket siblings. Thus all currently available ACPI

> >>>systems should have MC equal to LLC, including the NUMA in socket

> >>>machines where the LLC is partitioned between the NUMA nodes.

> >>>

> >>>Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

> >>>---

> >>>  arch/arm64/include/asm/topology.h |  2 ++

> >>>  arch/arm64/kernel/topology.c      | 32 +++++++++++++++++++++++++++++++-

> >>>  2 files changed, 33 insertions(+), 1 deletion(-)

> >>>

> >>>diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h

> >>>index 6b10459e6905..df48212f767b 100644

> >>>--- a/arch/arm64/include/asm/topology.h

> >>>+++ b/arch/arm64/include/asm/topology.h

> >>>@@ -8,8 +8,10 @@ struct cpu_topology {

> >>>  	int thread_id;

> >>>  	int core_id;

> >>>  	int package_id;

> >>>+	int llc_id;

> >>>  	cpumask_t thread_sibling;

> >>>  	cpumask_t core_sibling;

> >>>+	cpumask_t llc_siblings;

> >>>  };

> >>>  extern struct cpu_topology cpu_topology[NR_CPUS];

> >>>diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c

> >>>index bd1aae438a31..20b4341dc527 100644

> >>>--- a/arch/arm64/kernel/topology.c

> >>>+++ b/arch/arm64/kernel/topology.c

> >>>@@ -13,6 +13,7 @@

> >>>  #include <linux/acpi.h>

> >>>  #include <linux/arch_topology.h>

> >>>+#include <linux/cacheinfo.h>

> >>>  #include <linux/cpu.h>

> >>>  #include <linux/cpumask.h>

> >>>  #include <linux/init.h>

> >>>@@ -214,7 +215,19 @@ EXPORT_SYMBOL_GPL(cpu_topology);

> >>>  const struct cpumask *cpu_coregroup_mask(int cpu)

> >>>  {

> >>>-	return &cpu_topology[cpu].core_sibling;

> >>>+	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

> >>>+

> >>>+	/* Find the smaller of NUMA, core or LLC siblings */

> >>>+	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {

> >>>+		/* not numa in package, lets use the package siblings */

> >>>+		core_mask = &cpu_topology[cpu].core_sibling;

> >>>+	}

> >>>+	if (cpu_topology[cpu].llc_id != -1) {

> >>>+		if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))

> >>>+			core_mask = &cpu_topology[cpu].llc_siblings;

> >>>+	}

> >>>+

> >>>+	return core_mask;

> >>>  }

> >>>  static void update_siblings_masks(unsigned int cpuid)

> >>>@@ -226,6 +239,9 @@ static void update_siblings_masks(unsigned int cpuid)

> >>>  	for_each_possible_cpu(cpu) {

> >>>  		cpu_topo = &cpu_topology[cpu];

> >>>+		if (cpuid_topo->llc_id == cpu_topo->llc_id)

> >>>+			cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);

> >>>+

> >>

> >>Would this not result in cpuid_topo->llc_siblings = cpu_possible_mask

> >>on DT systems where llc_id is not set/defaults to -1 and still pass the

> >>condition. Does it make sense to add additional -1 check ?

> >

> >I don't think mask will be used by the current code if llc_id == -1 as

> >the user does the check. Is it better to have the mask empty than

> >default to cpu_possible_mask? If we require all users to implement a

> >check it shouldn't matter.

> >

> 

> Right.

> 

> There is also the other way of thinking about it, which is if you remove the

> if llc_id == -1 check in cpu_coregroup_mask() does it make more sense to

> have llc_siblings default equal all the cores, or just the one being

> requested?


Since we define cpu_coregroup_mask() to be the smallest of LLC, package,
and NUMA node, letting it default to just one cpu would change/break the
topology on non-PPTT systems. Wouldn't it?

If we want to drop the check llc_siblings should be default to either
core_siblings or cpumask_of_node(). But I don't really see the point as
any user of llc_siblings that really care about where the LLC is
would have to check if llc_sibling is just assigned a default value or
it is indeed representing the LLC. I'm fine with just expecting the user
to check llc_id to see if the llc_sibling mask is valid or not.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Morten Rasmussen May 3, 2018, 3:12 p.m. UTC | #14
On Wed, Apr 25, 2018 at 06:31:21PM -0500, Jeremy Linton wrote:
> Now that we have an accurate view of the physical topology

> we need to represent it correctly to the scheduler. Generally MC

> should equal the LLC in the system, but there are a number of

> special cases that need to be dealt with.

> 

> In the case of NUMA in socket, we need to assure that the sched

> domain we build for the MC layer isn't larger than the DIE above it.

> Similarly for LLC's that might exist in cross socket interconnect or

> directory hardware we need to assure that MC is shrunk to the socket

> or NUMA node.

> 

> This patch builds a sibling mask for the LLC, and then picks the

> smallest of LLC, socket siblings, or NUMA node siblings, which

> gives us the behavior described above. This is ever so slightly

> different than the similar alternative where we look for a cache

> layer less than or equal to the socket/NUMA siblings.

> 

> The logic to pick the MC layer affects all arm64 machines, but

> only changes the behavior for DT/MPIDR systems if the NUMA domain

> is smaller than the core siblings (generally set to the cluster).

> Potentially this fixes a possible bug in DT systems, but really

> it only affects ACPI systems where the core siblings is correctly

> set to the socket siblings. Thus all currently available ACPI

> systems should have MC equal to LLC, including the NUMA in socket

> machines where the LLC is partitioned between the NUMA nodes.

> 

> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>


This patch looks good to me.

Acked-by: Morten Rasmussen <morten.rasmussen@arm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
vkilari@codeaurora.org May 4, 2018, 8:10 a.m. UTC | #15
> -----Original Message-----

> From: Jeremy Linton <jeremy.linton@arm.com>

> Sent: Thursday, April 26, 2018 5:01 AM

> To: linux-acpi@vger.kernel.org

> Cc: Sudeep.Holla@arm.com; linux-arm-kernel@lists.infradead.org;

> Lorenzo.Pieralisi@arm.com; hanjun.guo@linaro.org; rjw@rjwysocki.net;

> Will.Deacon@arm.com; Catalin.Marinas@arm.com;

> gregkh@linuxfoundation.org; Mark.Rutland@arm.com; linux-

> kernel@vger.kernel.org; linux-riscv@lists.infradead.org;

> wangxiongfeng2@huawei.com; vkilari@codeaurora.org; ahs3@redhat.com;

> Dietmar.Eggemann@arm.com; Morten.Rasmussen@arm.com;

> palmer@sifive.com; lenb@kernel.org; john.garry@huawei.com;

> austinwc@codeaurora.org; tnowicki@caviumnetworks.com;

> jhugo@qti.qualcomm.com; timur@qti.qualcomm.com;

> ard.biesheuvel@linaro.org; Jeremy Linton <jeremy.linton@arm.com>

> Subject: [PATCH v8 00/13] Support PPTT for ARM64

> 

> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is

used to
> describe the processor and cache topology. Ideally it is used to

extend/override
> information provided by the hardware, but right now ARM64 is entirely

> dependent on firmware provided tables.

> 

> This patch parses the table for the cache topology and CPU topology.

> When we enable ACPI/PPTT for arm64 we map the package_id to the PPTT

> node flagged as the physical package by the firmware.

> This results in topologies that match what the remainder of the system

expects.
> Finally, we update the scheduler MC domain so that it generally reflects

the
> LLC unless the LLC is too large for the NUMA domain (or package).

> 

> For example on juno:

> [root@mammon-juno-rh topology]# lstopo-no-graphics

>   Package L#0

>     L2 L#0 (1024KB)

>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)

>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)

>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)

>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)

>     L2 L#1 (2048KB)

>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)

>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)

>   HostBridge L#0

>     PCIBridge

>       PCIBridge

>         PCIBridge

>           PCI 1095:3132

>             Block(Disk) L#0 "sda"

>         PCIBridge

>           PCI 1002:68f9

>             GPU L#1 "renderD128"

>             GPU L#2 "card0"

>             GPU L#3 "controlD64"

>         PCIBridge

>           PCI 11ab:4380

>             Net L#4 "enp8s0"

> 

> Git tree at:

> http://linux-arm.org/git?p=linux-jlinton.git

> branch: pptt_v8


Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>


> 

> v7->v8:

> Modify the logic used to select the MC domain (the change

>   shouldn't modify the sched domains on any existing machines

>   compared to v7, only how they are built) Reduce the severity of some

parsing
> messages.

> Fix s390 link problem.

> Further checks to deal with broken PPTT tables.

> Various style tweaks, SPDX license addition, etc.

> 

> v6->v7:

> Add additional patch to use the last cache level within the NUMA

>   or socket as the MC domain. This assures the MC domain is

>   equal or smaller than the DIE.

> 

> Various formatting/etc review comments.

> 

> Rebase to 4.16rc2

> 

> v5->v6:

> Add additional patches which re-factor how the initial DT code sets

>   up the cacheinfo structure so that its not as dependent on the

>   of_node stored in that tree. Once that is done we rename it

>   for use with the ACPI code.

> 

> Additionally there were a fair number of minor name/location/etc

>   tweaks scattered about made in response to review comments.

> 

> v4->v5:

> Update the cache type from NOCACHE to UNIFIED when all the cache

>   attributes we update are valid. This fixes a problem where caches

>   which are entirely created by the PPTT don't show up in lstopo.

> 

> Give the PPTT its own firmware_node in the cache structure instead of

>   sharing it with the of_node.

> 

> Move some pieces around between patches.

> 

> (see previous cover letters for futher changes)

> 

> Jeremy Linton (13):

>   drivers: base: cacheinfo: move cache_setup_of_node()

>   drivers: base: cacheinfo: setup DT cache properties early

>   cacheinfo: rename of_node to fw_token

>   arm64/acpi: Create arch specific cpu to acpi id helper

>   ACPI/PPTT: Add Processor Properties Topology Table parsing

>   ACPI: Enable PPTT support on ARM64

>   drivers: base cacheinfo: Add support for ACPI based firmware tables

>   arm64: Add support for ACPI based firmware tables

>   ACPI/PPTT: Add topology parsing code

>   arm64: topology: rename cluster_id

>   arm64: topology: enable ACPI/PPTT based CPU topology

>   ACPI: Add PPTT to injectable table list

>   arm64: topology: divorce MC scheduling domain from core_siblings

> 

>  arch/arm64/Kconfig                |   1 +

>  arch/arm64/include/asm/acpi.h     |   4 +

>  arch/arm64/include/asm/topology.h |   6 +-

>  arch/arm64/kernel/cacheinfo.c     |  15 +-

>  arch/arm64/kernel/topology.c      | 103 +++++-

>  arch/riscv/kernel/cacheinfo.c     |   1 -

>  drivers/acpi/Kconfig              |   3 +

>  drivers/acpi/Makefile             |   1 +

>  drivers/acpi/pptt.c               | 678

> ++++++++++++++++++++++++++++++++++++++

>  drivers/acpi/tables.c             |   2 +-

>  drivers/base/cacheinfo.c          | 157 ++++-----

>  include/linux/acpi.h              |   4 +

>  include/linux/cacheinfo.h         |  18 +-

>  13 files changed, 886 insertions(+), 107 deletions(-)  create mode 100644

> drivers/acpi/pptt.c

> 

> --

> 2.13.6



--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiongfeng Wang May 4, 2018, 11:34 a.m. UTC | #16
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>


Tested on D05 board. 'lscpu' prints the following info:
localhost:~ # lscpu
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    1
Core(s) per socket:    32
Socket(s):             2
NUMA node(s):          4
L1d cache:             32K
L1i cache:             48K
L2 cache:              1024K
L3 cache:              16384K
NUMA node0 CPU(s):     0-15
NUMA node1 CPU(s):     16-31
NUMA node2 CPU(s):     32-47
NUMA node3 CPU(s):     48-63

'sched_domain' is as follows
localhost:~ # cat /proc/schedstat
version 15
timestamp 4294936236
cpu0 0 0 0 0 0 0 2471285600 2634828800 4813
domain0 00000000,0000ffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


On 2018/4/26 7:31, Jeremy Linton wrote:
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is

> used to describe the processor and cache topology. Ideally it is

> used to extend/override information provided by the hardware, but

> right now ARM64 is entirely dependent on firmware provided tables.

> 

> This patch parses the table for the cache topology and CPU topology.

> When we enable ACPI/PPTT for arm64 we map the package_id to the

> PPTT node flagged as the physical package by the firmware.

> This results in topologies that match what the remainder of the

> system expects. Finally, we update the scheduler MC domain so that

> it generally reflects the LLC unless the LLC is too large for the

> NUMA domain (or package).

> 

> For example on juno:

> [root@mammon-juno-rh topology]# lstopo-no-graphics

>   Package L#0

>     L2 L#0 (1024KB)

>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)

>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)

>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)

>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)

>     L2 L#1 (2048KB)

>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)

>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)

>   HostBridge L#0

>     PCIBridge

>       PCIBridge

>         PCIBridge

>           PCI 1095:3132

>             Block(Disk) L#0 "sda"

>         PCIBridge

>           PCI 1002:68f9

>             GPU L#1 "renderD128"

>             GPU L#2 "card0"

>             GPU L#3 "controlD64"

>         PCIBridge

>           PCI 11ab:4380

>             Net L#4 "enp8s0"

> 

> Git tree at:

> http://linux-arm.org/git?p=linux-jlinton.git

> branch: pptt_v8

> 

> v7->v8:

> Modify the logic used to select the MC domain (the change

>   shouldn't modify the sched domains on any existing machines

>   compared to v7, only how they are built)

> Reduce the severity of some parsing messages.

> Fix s390 link problem.

> Further checks to deal with broken PPTT tables.

> Various style tweaks, SPDX license addition, etc.

> 

> v6->v7:

> Add additional patch to use the last cache level within the NUMA

>   or socket as the MC domain. This assures the MC domain is

>   equal or smaller than the DIE.

>   

> Various formatting/etc review comments.

> 

> Rebase to 4.16rc2

> 

> v5->v6:

> Add additional patches which re-factor how the initial DT code sets

>   up the cacheinfo structure so that its not as dependent on the

>   of_node stored in that tree. Once that is done we rename it

>   for use with the ACPI code.

> 

> Additionally there were a fair number of minor name/location/etc

>   tweaks scattered about made in response to review comments.

> 

> v4->v5:

> Update the cache type from NOCACHE to UNIFIED when all the cache

>   attributes we update are valid. This fixes a problem where caches

>   which are entirely created by the PPTT don't show up in lstopo.

> 

> Give the PPTT its own firmware_node in the cache structure instead of

>   sharing it with the of_node.

> 

> Move some pieces around between patches.

> 

> (see previous cover letters for futher changes)

> 

> Jeremy Linton (13):

>   drivers: base: cacheinfo: move cache_setup_of_node()

>   drivers: base: cacheinfo: setup DT cache properties early

>   cacheinfo: rename of_node to fw_token

>   arm64/acpi: Create arch specific cpu to acpi id helper

>   ACPI/PPTT: Add Processor Properties Topology Table parsing

>   ACPI: Enable PPTT support on ARM64

>   drivers: base cacheinfo: Add support for ACPI based firmware tables

>   arm64: Add support for ACPI based firmware tables

>   ACPI/PPTT: Add topology parsing code

>   arm64: topology: rename cluster_id

>   arm64: topology: enable ACPI/PPTT based CPU topology

>   ACPI: Add PPTT to injectable table list

>   arm64: topology: divorce MC scheduling domain from core_siblings

> 

>  arch/arm64/Kconfig                |   1 +

>  arch/arm64/include/asm/acpi.h     |   4 +

>  arch/arm64/include/asm/topology.h |   6 +-

>  arch/arm64/kernel/cacheinfo.c     |  15 +-

>  arch/arm64/kernel/topology.c      | 103 +++++-

>  arch/riscv/kernel/cacheinfo.c     |   1 -

>  drivers/acpi/Kconfig              |   3 +

>  drivers/acpi/Makefile             |   1 +

>  drivers/acpi/pptt.c               | 678 ++++++++++++++++++++++++++++++++++++++

>  drivers/acpi/tables.c             |   2 +-

>  drivers/base/cacheinfo.c          | 157 ++++-----

>  include/linux/acpi.h              |   4 +

>  include/linux/cacheinfo.h         |  18 +-

>  13 files changed, 886 insertions(+), 107 deletions(-)

>  create mode 100644 drivers/acpi/pptt.c

> 


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sudeep Holla May 4, 2018, 11:44 a.m. UTC | #17
Hi Vijaya Kumar,

On 04/05/18 09:10, vkilari@codeaurora.org wrote:
> 

[...]

CPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
> used to

>> describe the processor and cache topology. Ideally it is used to

> extend/override


FYI, these messages go no where to the git commit log. Only individual
patch logs/messages will go and not the cover letter ones. Check if you
are happy with those.

-- 
Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html