[RFC,v2,1/2] topology: Represent clusters of CPUs within a die.

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Node Structure representing
a higher level of topology.

For example, Kunpeng 920 has 6 clusters in each NUMA node, and each
cluster has 4 CPUs. All clusters share L3 cache data, but each cluster
has its own local L3 tag. In addition, the CPUs within each cluster
share some internal system bus.

+-----------------------------------+                          +---------+
|  +------+    +------+            +---------------------------+         |
|  | CPU0 |    | CPU1 |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+   cluster   |    |    tag    |         |         |
|  | CPU2 |    | CPU3 |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   |    |    L3     |         |         |
|  +------+    +------+             +----+    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |   L3    |
                                                               |   data  |
+-----------------------------------+                          |         |
|  +------+    +------+             |    +-----------+         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             +----+    L3     |         |         |
|                                   |    |    tag    |         |         |
|  +------+    +------+             |    |           |         |         |
|  |      |    |      |            ++    +-----------+         |         |
|  +------+    +------+            |---------------------------+         |
+-----------------------------------|                          |         |
+-----------------------------------|                          |         |
|  +------+    +------+            +---------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+             |    |    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |   +-----------+          |         |
|  +------+    +------+             |   |           |          |         |
|                                   |   |    L3     |          |         |
|  +------+    +------+             +---+    tag    |          |         |
|  |      |    |      |             |   |           |          |         |
|  +------+    +------+             |   +-----------+          |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                         ++         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |  +-----------+           |         |
|  +------+    +------+             |  |           |           |         |
|                                   |  |    L3     |           |         |
|  +------+    +------+             +--+    tag    |           |         |
|  |      |    |      |             |  |           |           |         |
|  +------+    +------+             |  +-----------+           |         |
|                                   |                          +---------+
+-----------------------------------+

That means the cost to transfer ownership of a cacheline between CPUs
within a cluster is lower than between CPUs in different clusters on
the same die. Hence, it can make sense to tell the scheduler to use
the cache affinity of the cluster to make better decisions on thread
migration.
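
As a toy illustration of the reasoning above (not the scheduler change
itself), the sketch below prefers an idle CPU in the same cluster as the
task's previous CPU, then one in the same NUMA node, then any. It
assumes the Kunpeng 920 layout described earlier with CPUs numbered
contiguously within a cluster, which is an assumption; cluster_of(),
node_of() and pick_cpu() are hypothetical names.

```python
CPUS_PER_CLUSTER = 4    # from the Kunpeng 920 example
CLUSTERS_PER_NODE = 6   # from the Kunpeng 920 example
CPUS_PER_NODE = CPUS_PER_CLUSTER * CLUSTERS_PER_NODE


def cluster_of(cpu):
    """Cluster index of a CPU, assuming contiguous numbering (assumption)."""
    return cpu // CPUS_PER_CLUSTER


def node_of(cpu):
    """NUMA node index of a CPU, under the same numbering assumption."""
    return cpu // CPUS_PER_NODE


def pick_cpu(prev_cpu, idle_cpus):
    """Prefer an idle CPU in prev_cpu's cluster, then its node, then any."""
    def distance(cpu):
        if cluster_of(cpu) == cluster_of(prev_cpu):
            return 0    # cheapest cacheline ownership transfer
        if node_of(cpu) == node_of(prev_cpu):
            return 1
        return 2

    if not idle_cpus:
        return None
    # Ties are broken by the lowest CPU number to keep the result stable.
    return min(idle_cpus, key=lambda cpu: (distance(cpu), cpu))
```

With a previous CPU of 5 and idle CPUs 0, 7 and 30, this picks CPU 7,
the only candidate in the same cluster.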

This patch simply exposes this information to userspace libraries
like hwloc by providing cluster_cpus and related sysfs attributes.
A PoC of hwloc support is available at [2].
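
From userspace, the new attribute could be consumed like the existing
topology masks. The sketch below parses a cpumask string in the
kernel's usual comma-separated hex format. The sysfs path in
read_cluster_cpus() is an assumption based on where the existing
topology attributes live, and both helper names are hypothetical.

```python
def parse_cpumask(text):
    """Convert a sysfs cpumask such as '00000000,000000f0' to a CPU list."""
    # Groups are 32-bit hex words, most significant first, so dropping
    # the commas yields one large hexadecimal number.
    value = int(text.strip().replace(",", ""), 16)
    cpus = []
    cpu = 0
    while value:
        if value & 1:
            cpus.append(cpu)
        value >>= 1
        cpu += 1
    return cpus


def read_cluster_cpus(cpu):
    """Read the cluster siblings of a CPU from sysfs (path is assumed)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/cluster_cpus"
    with open(path) as f:
        return parse_cpumask(f.read())
```

For instance, a mask of 000000f0 corresponds to CPUs 4 to 7, which is
what the second cluster of a node would report under contiguous
numbering.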

Note this patch only handles the ACPI case.

Special consideration is needed for SMT processors, where it is
necessary to move 2 levels up the hierarchy from the leaf nodes
(thus skipping the processor core level).

Currently the ID provided is the offset of the Processor
Hierarchy Node Structure within PPTT.  Whilst this is unique,
it is not terribly elegant, so alternative suggestions are welcome.
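
The lookup described in the two paragraphs above can be sketched as
follows. This is a simplified model rather than the code in
drivers/acpi/pptt.c: the table is stood in for by a dict keyed by each
node's offset within PPTT, and cluster_id() is a hypothetical name. The
flag bit follows the "Processor is a Thread" flag in [1].

```python
PPTT_FLAG_IS_THREAD = 1 << 2   # "Processor is a Thread" flag bit


def cluster_id(nodes, cpu_offset):
    """Return the PPTT offset of the cluster node above a leaf CPU node.

    `nodes` maps a node's offset within PPTT to a dict holding its
    `parent` offset (0 means no parent) and its `flags`. This dict is a
    simplified stand-in for the real table, used only for illustration.
    """
    cpu = nodes[cpu_offset]
    # SMT threads sit one level below the core, so skip the core level.
    levels = 2 if cpu["flags"] & PPTT_FLAG_IS_THREAD else 1
    offset = cpu_offset
    for _ in range(levels):
        offset = nodes[offset]["parent"]
        if offset == 0:
            return None  # hierarchy too shallow to contain a cluster level
    # The offset itself serves as the (unique, if inelegant) cluster ID.
    return offset
```

Because the returned value is just the node's offset, two CPUs are in
the same cluster exactly when cluster_id() gives the same number for
both.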

Note that arm64 / ACPI does not provide any means of identifying
a die level in the topology, but that may be unrelated to the cluster
level.

[1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
    structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>

---
 -v2: no code change, just refine the commit log
 * ABI documentation to be handled separately as a precursor patch is
   needed to add the existing topology ABI
 * Discussion of exact naming postponed for a future patch as no
   conclusion has been reached yet

 Documentation/admin-guide/cputopology.rst | 26 +++++++++++---
 arch/arm64/kernel/topology.c              |  2 ++
 drivers/acpi/pptt.c                       | 60 +++++++++++++++++++++++++++++++
 drivers/base/arch_topology.c              | 14 ++++++++
 drivers/base/topology.c                   | 10 ++++++
 include/linux/acpi.h                      |  5 +++
 include/linux/arch_topology.h             |  5 +++
 include/linux/topology.h                  |  6 ++++
 8 files changed, 124 insertions(+), 4 deletions(-)

-- 
2.7.4

Message ID	20201201025944.18260-2-song.bao.hua@hisilicon.com
State	Superseded
Headers	From: Barry Song <song.bao.hua@hisilicon.com>
	To: <valentin.schneider@arm.com>, <catalin.marinas@arm.com>, <will@kernel.org>, <rjw@rjwysocki.net>, <lenb@kernel.org>, <gregkh@linuxfoundation.org>, <Jonathan.Cameron@huawei.com>, <mingo@redhat.com>, <peterz@infradead.org>, <juri.lelli@redhat.com>, <vincent.guittot@linaro.org>, <dietmar.eggemann@arm.com>, <rostedt@goodmis.org>, <bsegall@google.com>, <mgorman@suse.de>, <mark.rutland@arm.com>, <linux-arm-kernel@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <linux-acpi@vger.kernel.org>
	CC: <linuxarm@huawei.com>, <xuwei5@huawei.com>, <prime.zeng@hisilicon.com>, Barry Song <song.bao.hua@hisilicon.com>
	Subject: [RFC PATCH v2 1/2] topology: Represent clusters of CPUs within a die.
	Date: Tue, 1 Dec 2020 15:59:43 +1300
	In-Reply-To: <20201201025944.18260-1-song.bao.hua@hisilicon.com>
	List-ID: <linux-acpi.vger.kernel.org>
Series	[RFC,v2,1/2] topology: Represent clusters of CPUs within a die.
	[RFC,v2,2/2] scheduler: add scheduler level for clusters
