[v7,2/8] Documentation: arm: define DT idle states bindings

Message ID 1407945127-27554-3-git-send-email-lorenzo.pieralisi@arm.com
State Superseded

Commit Message

Lorenzo Pieralisi Aug. 13, 2014, 3:52 p.m. UTC
ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings
for ARM platforms. Bindings define idle state parameters inclusive of entry
methods and state latencies, to allow operating systems to retrieve the
configuration entries from the device tree and initialize the related power
management drivers, paving the way for common code in the kernel to deal with
idle states and removing the need for static data in current and previous
kernel versions.

ARM64 platforms require the DT to define an entry-method property
for idle states.

On systems implementing PSCI as an enable-method to enter low-power
states, the PSCI CPU suspend method requires the power_state parameter to
be passed to the PSCI CPU suspend function.

This parameter is specific to a power state and platform specific,
therefore must be provided by firmware to the OS in order to enable
proper call sequence.

Thus, this patch also adds a property in the PSCI bindings that
describes how the PSCI CPU suspend power_state parameter should be
defined in DT in all device nodes that rely on PSCI CPU suspend method usage.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 679 +++++++++++++++++++++
 Documentation/devicetree/bindings/arm/psci.txt     |  14 +-
 3 files changed, 700 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

Comments

Lina Iyer Aug. 13, 2014, 7:25 p.m. UTC | #1
Hi Lorenzo,

On Wed, Aug 13, 2014 at 04:52:01PM +0100, Lorenzo Pieralisi wrote:
>+===========================================
>+4 - Examples
>+===========================================
>+
>+Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method):
>+
>+cpus {
>+	#size-cells = <0>;
>+	#address-cells = <2>;
>+
>+	CPU0: cpu@0 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x0>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
Sorry for jumping in late. I haven't gone through all the patches yet or
followed the previous discussions; if somebody could answer this or point
me to the discussion, it would be great.
Why is the cpu defining the possible cluster idle states? Would it not be
better for cluster states to form a separate node, something like this -

	CLUSTER0: cluster@0 {
		...
		cpus = <&CPU0 &CPU1 &CPU2 &CPU3>;
		cluster-idle-states = <&CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
	};
		
Allowing for something like this to be defined - 

	super_cluster0: cluster@101 {
		...
		clusters = <&CLUSTER0 &CLUSTER1>;
		cluster-idle-states = <&SOC_RETENTION &SOC_SLEEP>;
	};

Each cluster-idle-state follows the general idle state definition
provided in this document, plus an indicator of the state that the comprising
components must be idle in for this idle state to be available.

	CLUSTER_SLEEP_0: cluster-sleep@0 {
		...
		/* sleep definition for cluster0's retention */
		min-idle-state = <&CPU_SLEEP_0>;
	};

	SOC_SLEEP: cluster-sleep@101 {
		...
		min-idle-state = <&CLUSTER_SLEEP_0>;
	};
		

This opens up the idle states to a lot of hierarchical possibilities, which,
if you think about it, is generally how an SoC is organized.
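
To make the min-idle-state idea concrete, here is a rough sketch in C. It is
purely illustrative and not part of the patch or of any kernel code: the
function name, the integer "depth" encoding (0 for running or wfi, larger
values for deeper states) and the array layout are all assumptions. A parent
(cluster or SoC) state is treated as available only when every child is
already at least as deep as the state named by min-idle-state.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: child_depths[i] is the idle depth currently entered
 * by child i (0 = running/wfi, larger = deeper). min_child_depth models the
 * state referenced by the proposed min-idle-state property.
 */
static bool parent_state_available(const int *child_depths, size_t n,
				   int min_child_depth)
{
	size_t i;

	/* A parent with no children has nothing to power down. */
	if (n == 0)
		return false;

	for (i = 0; i < n; i++) {
		/* One shallow child is enough to block the parent state. */
		if (child_depths[i] < min_child_depth)
			return false;
	}

	return true;
}
```

The same check could be applied one level up, with cluster depths as the
children of an SoC-level state.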


Thanks,
Lina
Lorenzo Pieralisi Aug. 13, 2014, 10:11 p.m. UTC | #2
On Wed, Aug 13, 2014 at 08:25:36PM +0100, Lina Iyer wrote:
> Hi Lorenzo,
> 
> On Wed, Aug 13, 2014 at 04:52:01PM +0100, Lorenzo Pieralisi wrote:
> >+===========================================
> >+4 - Examples
> >+===========================================
> >+
> >+Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method):
> >+
> >+cpus {
> >+	#size-cells = <0>;
> >+	#address-cells = <2>;
> >+
> >+	CPU0: cpu@0 {
> >+		device_type = "cpu";
> >+		compatible = "arm,cortex-a57";
> >+		reg = <0x0 0x0>;
> >+		enable-method = "psci";
> >+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
> >+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
> >+	};
> Sorry for jumping in late. I haven't gone through all the patches yet or
> followed the previous discussions; if somebody could answer this or point
> me to the discussion, it would be great.
> Why is the cpu defining the possible cluster idle states? Would it not be
> better for cluster states to form a separate node, something like this -
> 
> 	CLUSTER0: cluster@0 {
> 		...
> 		cpus = <&CPU0 &CPU1 &CPU2 &CPU3>;
> 		cluster-idle-states = <&CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
> 	};
> 		
> Allowing for something like this to be defined - 
> 
> 	super_cluster0: cluster@101 {
> 		...
> 		clusters = <&CLUSTER0 &CLUSTER1>;
> 		cluster-idle-states = <&SOC_RETENTION &SOC_SLEEP>;
> 	};
> 
> Each cluster-idle-state follows the general idle state definition
> provided in this document, plus an indicator of the state that the comprising
> components must be idle in for this idle state to be available.
> 
> 	CLUSTER_SLEEP_0: cluster-sleep@0 {
> 		...
> 		/* sleep definition for cluster0's retention */
> 		min-idle-state = <&CPU_SLEEP_0>;
> 	};
> 
> 	SOC_SLEEP: cluster-sleep@101 {
> 		...
> 		min-idle-state = <&CLUSTER_SLEEP_0>;
> 	};
> 		
> 
> This opens up the idle states to a lot of hierarchical possibilities, which,
> if you think about it, is generally how an SoC is organized.

We have been thinking for 7 patch versions + some more for this specific
document, which is ready to go after extensive debate.

It is probably better to have a look at the archives first since honestly it
is impossible to summarize 6 months' worth of discussions in a few lines.

I think the hierarchy you mention should be implemented using power
domains, which is how the SoC implements power management and that's
what defines idle states hierarchy.

To be 100% precise, I would like to detect which cpus are affected by an
idle state entry by defining for each idle state what power domain
(which can be hierarchical) is affected, not by grouping them under
a tag "cluster" "supercluster" or whatchamacallit.

I removed power domains to simplify the current proposal, which is sufficient
as a starting point, but they are next on my TODO list and were part of the
initial bindings. Consider yourself welcome to help us define the way forward,
keeping this document as a starting point.

Lorenzo
Lina Iyer Aug. 15, 2014, 5:20 p.m. UTC | #3
On Wed, Aug 13, 2014 at 04:52:01PM +0100, Lorenzo Pieralisi wrote:
>ARM based platforms implement a variety of power management schemes that
>allow processors to enter idle states at run-time.
>The parameters defining these idle states vary on a per-platform basis forcing
>the OS to hardcode the state parameters in platform specific static tables
>whose size grows as the number of platforms supported in the kernel increases
>and hampers device drivers standardization.
>
>Therefore, this patch aims at standardizing idle state device tree bindings
>for ARM platforms. Bindings define idle state parameters inclusive of entry
>methods and state latencies, to allow operating systems to retrieve the
>configuration entries from the device tree and initialize the related power
>management drivers, paving the way for common code in the kernel to deal with
>idle states and removing the need for static data in current and previous
>kernel versions.
>
>ARM64 platforms require the DT to define an entry-method property
>for idle states.
>
>On systems implementing PSCI as an enable-method to enter low-power
>states, the PSCI CPU suspend method requires the power_state parameter to
>be passed to the PSCI CPU suspend function.
>
>This parameter is specific to a power state and platform specific,
>therefore must be provided by firmware to the OS in order to enable
>proper call sequence.
>
>Thus, this patch also adds a property in the PSCI bindings that
>describes how the PSCI CPU suspend power_state parameter should be
>defined in DT in all device nodes that rely on PSCI CPU suspend method usage.
>
>Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>Acked-by: Nicolas Pitre <nico@linaro.org>
>Reviewed-by: Rob Herring <robh@kernel.org>
>Reviewed-by: Sebastian Capella <sebcape@gmail.com>
>Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>---
> Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
> .../devicetree/bindings/arm/idle-states.txt        | 679 +++++++++++++++++++++
> Documentation/devicetree/bindings/arm/psci.txt     |  14 +-
> 3 files changed, 700 insertions(+), 1 deletion(-)
> create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt
>
>diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
>index 298e2f6..6fd0f15 100644
>--- a/Documentation/devicetree/bindings/arm/cpus.txt
>+++ b/Documentation/devicetree/bindings/arm/cpus.txt
>@@ -219,6 +219,12 @@ nodes to be present and contain the properties described below.
> 		Value type: <phandle>
> 		Definition: Specifies the ACC[2] node associated with this CPU.
>
>+	- cpu-idle-states
>+		Usage: Optional
>+		Value type: <prop-encoded-array>
>+		Definition:
>+			# List of phandles to idle state nodes supported
>+			  by this cpu [3].
>
> Example 1 (dual-cluster big.LITTLE system 32-bit):
>
>@@ -415,3 +421,5 @@ cpus {
> --
> [1] arm/msm/qcom,saw2.txt
> [2] arm/msm/qcom,kpss-acc.txt
>+[3] ARM Linux kernel documentation - idle states bindings
>+    Documentation/devicetree/bindings/arm/idle-states.txt
>diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
>new file mode 100644
>index 0000000..37375c7
>--- /dev/null
>+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
>@@ -0,0 +1,679 @@
>+==========================================
>+ARM idle states binding description
>+==========================================
>+
>+==========================================
>+1 - Introduction
>+==========================================
>+
>+ARM systems contain HW capable of managing power consumption dynamically,
>+where cores can be put in different low-power states (ranging from simple
>+wfi to power gating) according to OS PM policies. The CPU states representing
>+the range of dynamic idle states that a processor can enter at run-time, can be
>+specified through device tree bindings representing the parameters required
>+to enter/exit specific idle states on a given processor.
>+
>+According to the Server Base System Architecture document (SBSA, [3]), the
>+power states an ARM CPU can be put into are identified by the following list:
>+
>+- Running
>+- Idle_standby
>+- Idle_retention
>+- Sleep
>+- Off
>+
>+The power states described in the SBSA document define the basic CPU states on
>+top of which ARM platforms implement power management schemes that allow an OS
>+PM implementation to put the processor in different idle states (which include
>+states listed above; "off" state is not an idle state since it does not have
>+wake-up capabilities, hence it is not considered in this document).
>+
>+Idle state parameters (eg entry latency) are platform specific and need to be
>+characterized with bindings that provide the required information to OS PM
>+code so that it can build the required tables and use them at runtime.
>+
>+The device tree binding definition for ARM idle states is the subject of this
>+document.
>+
>+===========================================
>+2 - idle-states definitions
>+===========================================
>+
>+Idle states are characterized for a specific system through a set of
>+timing and energy related properties, that underline the HW behaviour
>+triggered upon idle states entry and exit.
>+
>+The following diagram depicts the CPU execution phases and related timing
>+properties required to enter and exit an idle state:
>+
>+..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
>+	    |          |           |          |          |
>+
>+	    |<------ entry ------->|
>+	    |       latency        |
>+					      |<- exit ->|
>+					      |  latency |
>+	    |<-------- min-residency -------->|
>+		       |<-------  wakeup-latency ------->|
>+
>+		Diagram 1: CPU idle state execution phases
>+
>+EXEC:	Normal CPU execution.
>+
>+PREP:	Preparation phase before committing the hardware to idle mode
>+	like cache flushing. This is abortable on pending wake-up
>+	event conditions. The abort latency is assumed to be negligible
>+	(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
>+	goes back to EXEC. This phase is optional. If not abortable,
>+	this should be included in the ENTRY phase instead.
>+
>+ENTRY:	The hardware is committed to idle mode. This period must run
>+	to completion up to IDLE before anything else can happen.
>+
>+IDLE:	This is the actual energy-saving idle period. This may last
>+	between 0 and infinite time, until a wake-up event occurs.
>+
>+EXIT:	Period during which the CPU is brought back to operational
>+	mode (EXEC).
>+
>+entry-latency: Worst case latency required to enter the idle state. The
>+exit-latency may be guaranteed only after entry-latency has passed.
>+
>+min-residency: Minimum period, including preparation and entry, for a given
>+idle state to be worthwhile energywise.
>+
>+wakeup-latency: Maximum delay between the signaling of a wake-up event and the
>+CPU being able to execute normal code again. If not specified, this is assumed
>+to be entry-latency + exit-latency.
>+
>+These timing parameters can be used by an OS in different circumstances.
>+
>+An idle CPU requires the expected min-residency time to select the most
>+appropriate idle state based on the expected expiry time of the next IRQ
>+(ie wake-up) that causes the CPU to return to the EXEC phase.
>+
>+An operating system scheduler may need to compute the shortest wake-up delay
>+for CPUs in the system by detecting how long it will take to get a CPU out
>+of an idle state, eg:
>+
>+wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0)
>+
>+In other words, the scheduler can make its scheduling decision by selecting
>+(eg waking-up) the CPU with the shortest wake-up latency.
>+The wake-up latency must take into account the entry latency if that period
>+has not expired. The abortable nature of the PREP period can be ignored
>+if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than
>+the worst case since it depends on the CPU operating conditions, ie caches
>+state).
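
The wakeup-delay formula above can be sketched in C as follows. This is only
an illustration under stated assumptions: the function name and the use of
microsecond timestamps are hypothetical and not taken from the patch, and the
caller is assumed to pass now_us >= entry_timestamp_us.

```c
#include <stdint.h>

/*
 * Hypothetical helper implementing
 *   wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0)
 * All values are in microseconds.
 */
static uint64_t wakeup_delay_us(uint32_t entry_latency_us,
				uint32_t exit_latency_us,
				uint64_t entry_timestamp_us,
				uint64_t now_us)
{
	uint64_t elapsed_us = now_us - entry_timestamp_us;
	uint64_t remaining_entry_us = 0;

	/*
	 * The ENTRY phase must run to completion, so whatever is left of
	 * the worst case entry latency is added to the exit latency.
	 */
	if (elapsed_us < entry_latency_us)
		remaining_entry_us = entry_latency_us - elapsed_us;

	return exit_latency_us + remaining_entry_us;
}
```

For a state with entry latency 250us and exit latency 500us, a CPU that began
entering the state 100us ago would be charged 500 + 150 = 650us, while one
that began 1000us ago would be charged only the 500us exit latency.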
>+
>+An OS has to reliably probe the wakeup-latency since some devices can enforce
>+latency constraint guarantees to work properly, so the OS has to detect the
>+worst case wake-up latency it can incur if a CPU is allowed to enter an
>+idle state, and possibly prevent that entry to guarantee reliable device
>+functioning.
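
One way an OS might combine these parameters is sketched below in C. It is a
minimal illustration, not the cpuidle governor: the struct, the function
names and the convention that a wakeup latency of 0 means "property omitted"
are all assumptions. States are assumed to be ordered from shallowest to
deepest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of one state node's timing properties. */
struct idle_state {
	uint32_t entry_latency_us;
	uint32_t exit_latency_us;
	uint32_t min_residency_us;
	uint32_t wakeup_latency_us;	/* 0: property omitted in the DT */
};

/* If wakeup-latency is omitted, fall back to entry + exit latency. */
static uint32_t effective_wakeup_latency_us(const struct idle_state *s)
{
	if (s->wakeup_latency_us)
		return s->wakeup_latency_us;
	return s->entry_latency_us + s->exit_latency_us;
}

/*
 * Return the index of the deepest state that is worthwhile for the expected
 * idle time and still honours the device latency limit, or -1 if only the
 * default wfi state qualifies.
 */
static int select_idle_state(const struct idle_state *states, size_t n,
			     uint32_t expected_idle_us,
			     uint32_t latency_limit_us)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Not idle long enough for this state to pay off. */
		if (states[i].min_residency_us > expected_idle_us)
			continue;
		/* Waking up would violate a device latency constraint. */
		if (effective_wakeup_latency_us(&states[i]) > latency_limit_us)
			continue;
		best = (int)i;
	}

	return best;
}
```

The scan visits every state rather than stopping at the first rejection,
since a deeper state may carry a smaller explicit wakeup latency than a
shallower one that omits the property.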
>+
>+The min-residency time parameter deserves further explanation since it is
>+expressed in time units but must factor in energy consumption coefficients.
>+
>+The energy consumption of a cpu when it enters a power state can be roughly
>+characterised by the following graph:
>+
>+               |
>+               |
>+               |
>+           e   |
>+           n   |                                      /---
>+           e   |                               /------
>+           r   |                        /------
>+           g   |                  /-----
>+           y   |           /------
>+               |       ----
>+               |      /|
>+               |     / |
>+               |    /  |
>+               |   /   |
>+               |  /    |
>+               | /     |
>+               |/      |
>+          -----|-------+----------------------------------
>+              0|       1                              time(ms)
>+
>+		Graph 1: Energy vs time example
>+
>+The graph is split in two parts delimited by time 1ms on the X-axis.
>+The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope
>+and denotes the energy costs incurred whilst entering and leaving the idle
>+state.
>+The graph curve in the area delimited by X-axis values = {x | x > 1ms } has
>+shallower slope and essentially represents the energy consumption of the idle
>+state.
>+
>+min-residency is defined for a given idle state as the minimum expected
>+residency time for a state (inclusive of preparation and entry) after
>+which choosing that state becomes the most energy efficient option. A good
>+way to visualise this is by taking the same graph above and comparing the
>+energy consumption plots of some states.
>+
>+For sake of simplicity, let's consider a system with two idle states IDLE1,
>+and IDLE2:
>+
>+          |
>+          |
>+          |
>+          |                                                  /-- IDLE1
>+       e  |                                              /---
>+       n  |                                         /----
>+       e  |                                     /---
>+       r  |                                /-----/--------- IDLE2
>+       g  |                    /-------/---------
>+       y  |        ------------    /---|
>+          |       /           /----    |
>+          |      /        /---         |
>+          |     /    /----             |
>+          |    / /---                  |
>+          |   ---                      |
>+          |  /                         |
>+          | /                          |
>+          |/                           |                  time
>+       ---/----------------------------+------------------------
>+          |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy
>+                                       |
>+                                IDLE2-min-residency
>+
>+		Graph 2: idle states min-residency example
>+
>+In graph 2 above, that takes into account idle states entry/exit energy
>+costs, it is clear that if the idle state residency time (ie time till next
>+wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state
>+choice energywise.
>+
>+This is mainly down to the fact that IDLE1 entry/exit energy costs are lower
>+than IDLE2.
>+
>+However, the lower power consumption (ie shallower energy curve slope) of idle
>+state IDLE2 implies that after a suitable time, IDLE2 becomes more energy
>+efficient.
>+
>+The time at which IDLE2 becomes more energy efficient than IDLE1 (and other
>+shallower states in a system with multiple idle states) is defined as
>+IDLE2-min-residency and corresponds to the time when the energy consumption
>+of the IDLE1 and IDLE2 states breaks even.
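
The break-even point in graph 2 can be approximated with a simple linear
model, sketched below in C. The model is an assumption made for illustration
and is not part of the binding: each state is described by a fixed entry/exit
energy cost (in nanojoules) plus a constant idle power (in milliwatts), so
that energy(t) = cost + power * t with t in microseconds (1 mW for 1 us is
1 nJ). The function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical estimate of the residency (in microseconds) after which the
 * deeper state has consumed no more energy than the shallower one.
 * Returns UINT64_MAX if the deeper state draws no less idle power, in which
 * case this sketch treats it as never worthwhile.
 */
static uint64_t break_even_us(uint64_t shallow_cost_nj,
			      uint32_t shallow_power_mw,
			      uint64_t deep_cost_nj,
			      uint32_t deep_power_mw)
{
	uint64_t cost_diff_nj, power_diff_mw;

	if (deep_power_mw >= shallow_power_mw)
		return UINT64_MAX;

	/* Cheaper to enter and cheaper to stay in: better from time 0. */
	if (deep_cost_nj <= shallow_cost_nj)
		return 0;

	cost_diff_nj = deep_cost_nj - shallow_cost_nj;
	power_diff_mw = shallow_power_mw - deep_power_mw;

	/* Round up so the returned time is never before the crossover. */
	return (cost_diff_nj + power_diff_mw - 1) / power_diff_mw;
}
```

With a shallow state costing 10000 nJ at 50 mW and a deep state costing
100000 nJ at 5 mW, the extra 90000 nJ is recovered at 45 nJ per microsecond,
giving a break-even residency of 2000 us.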
>+
>+The definitions provided in this section underpin the idle states
>+properties specification that is the subject of the following sections.
>+
>+===========================================
>+3 - idle-states node
>+===========================================
>+
>+ARM processor idle states are defined within the idle-states node, which is
>+a direct child of the cpus node [1] and provides a container where the
>+processor idle states, defined as device tree nodes, are listed.
>+
>+- idle-states node
>+
>+	Usage: Optional - On ARM systems, it is a container of processor idle
>+			  states nodes. If the system does not provide CPU
>+			  power management capabilities or the processor just
>+			  supports idle_standby an idle-states node is not
>+			  required.
>+
>+	Description: idle-states node is a container node, where its
>+		     subnodes describe the CPU idle states.
>+
>+	Node name must be "idle-states".
>+
>+	The idle-states node's parent node must be the cpus node.
>+
>+	The idle-states node's child nodes can be:
>+
>+	- one or more state nodes
>+
>+	Any other configuration is considered invalid.
>+
>+	An idle-states node defines the following properties:
>+
>+	- entry-method
>+		Value type: <stringlist>
>+		Usage and definition depend on ARM architecture version.
>+			# On ARM v8 64-bit this property is required and must
>+			  be one of:
>+			   - "psci" (see bindings in [2])
>+			# On ARM 32-bit systems this property is optional
>+
>+The nodes describing the idle states (state) can only be defined within the
>+idle-states node, any other configuration is considered invalid and therefore
>+must be ignored.
>+
>+===========================================
>+4 - state node
>+===========================================
>+
>+A state node represents an idle state description and must be defined as
>+follows:
>+
>+- state node
>+
>+	Description: must be child of the idle-states node
>+
>+	The state node name shall follow standard device tree naming
>+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
>+	are siblings within a single common parent must be given a unique name.
>+
>+	The idle state entered by executing the wfi instruction (idle_standby
>+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
>+	must not be listed.
>+
>+	With the definitions provided above, the following list represents
>+	the valid properties for a state node:
>+
>+	- compatible
>+		Usage: Required
>+		Value type: <stringlist>
>+		Definition: Must be "arm,idle-state".
>+
>+	- local-timer-stop
>+		Usage: See definition
>+		Value type: <none>
>+		Definition: if present the CPU local timer control logic is
>+			    lost on state entry, otherwise it is retained.
>+
>+	- entry-latency-us
>+		Usage: Required
>+		Value type: <prop-encoded-array>
>+		Definition: u32 value representing worst case latency in
>+			    microseconds required to enter the idle state.
>+			    The exit-latency-us duration may be guaranteed
>+			    only after entry-latency-us has passed.
>+
>+	- exit-latency-us
>+		Usage: Required
>+		Value type: <prop-encoded-array>
>+		Definition: u32 value representing worst case latency
>+			    in microseconds required to exit the idle state.
>+
>+	- min-residency-us
>+		Usage: Required
>+		Value type: <prop-encoded-array>
>+		Definition: u32 value representing minimum residency duration
>+			    in microseconds, inclusive of preparation and
>+			    entry, for this idle state to be considered
>+			    worthwhile energy wise (refer to section 2 of
>+			    this document for a complete description).
>+
>+	- wakeup-latency-us:
>+		Usage: Optional
>+		Value type: <prop-encoded-array>
>+		Definition: u32 value representing maximum delay between the
>+			    signaling of a wake-up event and the CPU being
>+			    able to execute normal code again. If omitted,
>+			    this is assumed to be equal to:
>+
>+				entry-latency-us + exit-latency-us
>+
>+			    It is important to supply this value on systems
>+			    where the duration of PREP phase (see diagram 1,
>+			    section 2) is non-negligible.
>+			    In such systems entry-latency-us + exit-latency-us
>+			    will exceed wakeup-latency-us by this duration.
>+
>+	In addition to the properties listed above, a state node may require
>+	additional properties specific to the entry-method defined in the
>+	idle-states node. Please refer to the entry-method bindings
>+	documentation for the property definitions.
How are the different idle state nodes differentiated?
Say I want to choose a different entry point function for each of these
nodes separately.
Also, not all targets may support all idle states. If we do not want to enter
retention on one target while another target might, there is no way
to know which idle state a node describes in order to set up the correct
entry function.

>+
>+===========================================
>+4 - Examples
>+===========================================
>+
>+Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method):
>+
>+cpus {
>+	#size-cells = <0>;
>+	#address-cells = <2>;
>+
>+	CPU0: cpu@0 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x0>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU1: cpu@1 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x1>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU2: cpu@100 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x100>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU3: cpu@101 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x101>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU4: cpu@10000 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x10000>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU5: cpu@10001 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x10001>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU6: cpu@10100 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x10100>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU7: cpu@10101 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a57";
>+		reg = <0x0 0x10101>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
>+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU8: cpu@100000000 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x0>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU9: cpu@100000001 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x1>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU10: cpu@100000100 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x100>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU11: cpu@100000101 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x101>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU12: cpu@100010000 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x10000>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU13: cpu@100010001 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x10001>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU14: cpu@100010100 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x10100>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU15: cpu@100010101 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a53";
>+		reg = <0x1 0x10101>;
>+		enable-method = "psci";
>+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
>+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	idle-states {
>+		entry-method = "arm,psci";
>+
>+		CPU_RETENTION_0_0: cpu-retention-0-0 {
>+			compatible = "arm,idle-state";
>+			arm,psci-suspend-param = <0x0010000>;
>+			entry-latency-us = <20>;
>+			exit-latency-us = <40>;
>+			min-residency-us = <80>;
>+		};
>+
>+		CLUSTER_RETENTION_0: cluster-retention-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x1010000>;
>+			entry-latency-us = <50>;
>+			exit-latency-us = <100>;
>+			min-residency-us = <250>;
>+			wakeup-latency-us = <130>;
>+		};
>+
>+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x0010000>;
>+			entry-latency-us = <250>;
>+			exit-latency-us = <500>;
>+			min-residency-us = <950>;
>+		};
>+
>+		CLUSTER_SLEEP_0: cluster-sleep-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x1010000>;
>+			entry-latency-us = <600>;
>+			exit-latency-us = <1100>;
>+			min-residency-us = <2700>;
>+			wakeup-latency-us = <1500>;
>+		};
>+
>+		CPU_RETENTION_1_0: cpu-retention-1-0 {
>+			compatible = "arm,idle-state";
>+			arm,psci-suspend-param = <0x0010000>;
>+			entry-latency-us = <20>;
>+			exit-latency-us = <40>;
>+			min-residency-us = <90>;
>+		};
>+
>+		CLUSTER_RETENTION_1: cluster-retention-1 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x1010000>;
>+			entry-latency-us = <50>;
>+			exit-latency-us = <100>;
>+			min-residency-us = <270>;
>+			wakeup-latency-us = <100>;
>+		};
>+
>+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x0010000>;
>+			entry-latency-us = <70>;
>+			exit-latency-us = <100>;
>+			min-residency-us = <300>;
>+			wakeup-latency-us = <150>;
>+		};
>+
>+		CLUSTER_SLEEP_1: cluster-sleep-1 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			arm,psci-suspend-param = <0x1010000>;
>+			entry-latency-us = <500>;
>+			exit-latency-us = <1200>;
>+			min-residency-us = <3500>;
>+			wakeup-latency-us = <1300>;
>+		};
>+	};
>+
>+};
>+
>+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
>+
>+cpus {
>+	#size-cells = <0>;
>+	#address-cells = <1>;
>+
>+	CPU0: cpu@0 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a15";
>+		reg = <0x0>;
>+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU1: cpu@1 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a15";
>+		reg = <0x1>;
>+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU2: cpu@2 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a15";
>+		reg = <0x2>;
>+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU3: cpu@3 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a15";
>+		reg = <0x3>;
>+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
>+	};
>+
>+	CPU4: cpu@100 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a7";
>+		reg = <0x100>;
>+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU5: cpu@101 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a7";
>+		reg = <0x101>;
>+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU6: cpu@102 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a7";
>+		reg = <0x102>;
>+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	CPU7: cpu@103 {
>+		device_type = "cpu";
>+		compatible = "arm,cortex-a7";
>+		reg = <0x103>;
>+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
>+	};
>+
>+	idle-states {
>+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			entry-latency-us = <200>;
>+			exit-latency-us = <100>;
>+			min-residency-us = <400>;
>+			wakeup-latency-us = <250>;
>+		};
>+
>+		CLUSTER_SLEEP_0: cluster-sleep-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			entry-latency-us = <500>;
>+			exit-latency-us = <1500>;
>+			min-residency-us = <2500>;
>+			wakeup-latency-us = <1700>;
>+		};
>+
>+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			entry-latency-us = <300>;
>+			exit-latency-us = <500>;
>+			min-residency-us = <900>;
>+			wakeup-latency-us = <600>;
>+		};
>+
>+		CLUSTER_SLEEP_1: cluster-sleep-1 {
>+			compatible = "arm,idle-state";
>+			local-timer-stop;
>+			entry-latency-us = <800>;
>+			exit-latency-us = <2000>;
>+			min-residency-us = <6500>;
>+			wakeup-latency-us = <2300>;
>+		};
>+	};
>+
>+};
>+
>+===========================================
>+5 - References
>+===========================================
>+
>+[1] ARM Linux Kernel documentation - CPUs bindings
>+    Documentation/devicetree/bindings/arm/cpus.txt
>+
>+[2] ARM Linux Kernel documentation - PSCI bindings
>+    Documentation/devicetree/bindings/arm/psci.txt
>+
>+[3] ARM Server Base System Architecture (SBSA)
>+    http://infocenter.arm.com/help/index.jsp
>+
>+[4] ARM Architecture Reference Manuals
>+    http://infocenter.arm.com/help/index.jsp
>+
>+[5] ePAPR standard
>+    https://www.power.org/documentation/epapr-version-1-1/
>diff --git a/Documentation/devicetree/bindings/arm/psci.txt b/Documentation/devicetree/bindings/arm/psci.txt
>index b4a58f3..5aa40ed 100644
>--- a/Documentation/devicetree/bindings/arm/psci.txt
>+++ b/Documentation/devicetree/bindings/arm/psci.txt
>@@ -50,6 +50,16 @@ Main node optional properties:
>
>  - migrate       : Function ID for MIGRATE operation
>
>+Device tree nodes that require usage of PSCI CPU_SUSPEND function (ie idle
>+state nodes, as per bindings in [1]) must specify the following properties:
>+
>+- arm,psci-suspend-param
>+		Usage: Required for state nodes[1] if the corresponding
>+                       idle-states node entry-method property is set
>+                       to "psci".
>+		Value type: <u32>
>+		Definition: power_state parameter to pass to the PSCI
>+			    suspend call.
>
> Example:
>
>@@ -64,7 +74,6 @@ Case 1: PSCI v0.1 only.
> 		migrate		= <0x95c10003>;
> 	};
>
>-
> Case 2: PSCI v0.2 only
>
> 	psci {
>@@ -88,3 +97,6 @@ Case 3: PSCI v0.2 and PSCI v0.1.
>
> 		...
> 	};
>+
>+[1] Kernel documentation - ARM idle states bindings
>+    Documentation/devicetree/bindings/arm/idle-states.txt
>-- 
>1.9.1
>
>
Lorenzo Pieralisi Aug. 15, 2014, 5:51 p.m. UTC | #4
On Fri, Aug 15, 2014 at 06:20:30PM +0100, Lina Iyer wrote:

[...]

> >+===========================================
> >+3 - idle-states node
> >+===========================================
> >+
> >+ARM processor idle states are defined within the idle-states node, which is
> >+a direct child of the cpus node [1] and provides a container where the
> >+processor idle states, defined as device tree nodes, are listed.
> >+
> >+- idle-states node
> >+
> >+      Usage: Optional - On ARM systems, it is a container of processor idle
> >+                        states nodes. If the system does not provide CPU
> >+                        power management capabilities or the processor just
> >+                        supports idle_standby an idle-states node is not
> >+                        required.
> >+
> >+      Description: idle-states node is a container node, where its
> >+                   subnodes describe the CPU idle states.
> >+
> >+      Node name must be "idle-states".
> >+
> >+      The idle-states node's parent node must be the cpus node.
> >+
> >+      The idle-states node's child nodes can be:
> >+
> >+      - one or more state nodes
> >+
> >+      Any other configuration is considered invalid.
> >+
> >+      An idle-states node defines the following properties:
> >+
> >+      - entry-method
> >+              Value type: <stringlist>
> >+              Usage and definition depend on ARM architecture version.
> >+                      # On ARM v8 64-bit this property is required and must
> >+                        be one of:
> >+                         - "psci" (see bindings in [2])
> >+                      # On ARM 32-bit systems this property is optional
> >+
> >+The nodes describing the idle states (state) can only be defined within the
> >+idle-states node, any other configuration is considered invalid and therefore
> >+must be ignored.
> >+
> >+===========================================
> >+4 - state node
> >+===========================================
> >+
> >+A state node represents an idle state description and must be defined as
> >+follows:
> >+
> >+- state node
> >+
> >+      Description: must be child of the idle-states node
> >+
> >+      The state node name shall follow standard device tree naming
> >+      rules ([5], 2.2.1 "Node names"), in particular state nodes which
> >+      are siblings within a single common parent must be given a unique name.
> >+
> >+      The idle state entered by executing the wfi instruction (idle_standby
> >+      SBSA,[3][4]) is considered standard on all ARM platforms and therefore
> >+      must not be listed.
> >+
> >+      With the definitions provided above, the following list represents
> >+      the valid properties for a state node:
> >+
> >+      - compatible
> >+              Usage: Required
> >+              Value type: <stringlist>
> >+              Definition: Must be "arm,idle-state".
> >+
> >+      - local-timer-stop
> >+              Usage: See definition
> >+              Value type: <none>
> >+              Definition: if present the CPU local timer control logic is
> >+                          lost on state entry, otherwise it is retained.
> >+
> >+      - entry-latency-us
> >+              Usage: Required
> >+              Value type: <prop-encoded-array>
> >+              Definition: u32 value representing worst case latency in
> >+                          microseconds required to enter the idle state.
> >+                          The exit-latency-us duration may be guaranteed
> >+                          only after entry-latency-us has passed.
> >+
> >+      - exit-latency-us
> >+              Usage: Required
> >+              Value type: <prop-encoded-array>
> >+              Definition: u32 value representing worst case latency
> >+                          in microseconds required to exit the idle state.
> >+
> >+      - min-residency-us
> >+              Usage: Required
> >+              Value type: <prop-encoded-array>
> >+              Definition: u32 value representing minimum residency duration
> >+                          in microseconds, inclusive of preparation and
> >+                          entry, for this idle state to be considered
> >+                          worthwhile energy wise (refer to section 2 of
> >+                          this document for a complete description).
> >+
> >+      - wakeup-latency-us:
> >+              Usage: Optional
> >+              Value type: <prop-encoded-array>
> >+              Definition: u32 value representing maximum delay between the
> >+                          signaling of a wake-up event and the CPU being
> >+                          able to execute normal code again. If omitted,
> >+                          this is assumed to be equal to:
> >+
> >+                              entry-latency-us + exit-latency-us
> >+
> >+                          It is important to supply this value on systems
> >+                          where the duration of PREP phase (see diagram 1,
> >+                          section 2) is non-negligible.
> >+                          In such systems entry-latency-us + exit-latency-us
> >+                          will exceed wakeup-latency-us by this duration.
> >+
> >+      In addition to the properties listed above, a state node may require
> >+      additional properties specific to the entry-method defined in the
> >+      idle-states node, please refer to the entry-method bindings
> >+      documentation for properties definitions.
> How are the different idle-states nodes differentiated?
> Say, I want to choose a different entry point function for each of these
> nodes separately.

You add an idle state entry method specific parameter (as PSCI does,
and it's part of this patch already) to each idle state and document it
in the respective binding.
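
A minimal sketch of that pattern, for illustration only and not part of this
patch: the "vendor,foo" entry-method string and the vendor,foo-state-id
property are hypothetical names, and a real entry method would have to
document its own per-state property in its binding, as psci.txt does for
arm,psci-suspend-param. The latency values are made up.

```dts
/* Illustrative only: "vendor,foo" and vendor,foo-state-id are hypothetical. */
idle-states {
	entry-method = "vendor,foo";

	CPU_RETENTION_0_0: cpu-retention-0-0 {
		compatible = "arm,idle-state";
		/* hypothetical per-state parameter read by the entry method */
		vendor,foo-state-id = <0x1>;
		entry-latency-us = <20>;
		exit-latency-us = <40>;
		min-residency-us = <80>;
	};

	CPU_SLEEP_0_0: cpu-sleep-0-0 {
		compatible = "arm,idle-state";
		local-timer-stop;
		/* a different value selects a different entry sequence */
		vendor,foo-state-id = <0x2>;
		entry-latency-us = <250>;
		exit-latency-us = <500>;
		min-residency-us = <950>;
	};
};
```

Under these assumptions, the code implementing the entry method could use the
per-state value to choose how each state is entered.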

> Also, not all targets may support all idle states. If we do not want to enter
> retention on one target, while an other target might, there is no way
> to know what the idle state node is to set up the correct entry function.

What's a target ? A CPU ? Every CPU lists its idle states through the
cpu-idle-states phandle list. If a CPU does not support an idle state it must
not be there.
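
As a sketch of what that looks like (illustrative values, trimmed to two
CPUs, and not taken from any real platform): CPU0 supports both retention and
sleep, while CPU4 does not support retention, so its cpu-idle-states list
simply leaves that state out.

```dts
cpus {
	#size-cells = <0>;
	#address-cells = <1>;

	CPU0: cpu@0 {
		device_type = "cpu";
		compatible = "arm,cortex-a15";
		reg = <0x0>;
		/* this CPU supports both states */
		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0>;
	};

	CPU4: cpu@100 {
		device_type = "cpu";
		compatible = "arm,cortex-a7";
		reg = <0x100>;
		/* no retention on this CPU: the state is not listed */
		cpu-idle-states = <&CPU_SLEEP_0_0>;
	};

	idle-states {
		CPU_RETENTION_0_0: cpu-retention-0-0 {
			compatible = "arm,idle-state";
			entry-latency-us = <20>;
			exit-latency-us = <40>;
			min-residency-us = <80>;
		};

		CPU_SLEEP_0_0: cpu-sleep-0-0 {
			compatible = "arm,idle-state";
			local-timer-stop;
			entry-latency-us = <200>;
			exit-latency-us = <100>;
			min-residency-us = <400>;
		};
	};
};
```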

Lorenzo
Catalin Marinas Aug. 18, 2014, 2:20 p.m. UTC | #5
On Wed, Aug 13, 2014 at 04:52:01PM +0100, Lorenzo Pieralisi wrote:
> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel increases
> and hampers device drivers standardization.
> 
> Therefore, this patch aims at standardizing idle state device tree bindings
> for ARM platforms. Bindings define idle state parameters inclusive of entry
> methods and state latencies, to allow operating systems to retrieve the
> configuration entries from the device tree and initialize the related power
> management drivers, paving the way for common code in the kernel to deal with
> idle states and removing the need for static data in current and previous
> kernel versions.
> 
> ARM64 platforms require the DT to define an entry-method property
> for idle states.
> 
> On system implementing PSCI as an enable-method to enter low-power
> states the PSCI CPU suspend method requires the power_state parameter to
> be passed to the PSCI CPU suspend function.
> 
> This parameter is specific to a power state and platform specific,
> therefore must be provided by firmware to the OS in order to enable
> proper call sequence.
> 
> Thus, this patch also adds a property in the PSCI bindings that
> describes how the PSCI CPU suspend power_state parameter should be
> defined in DT in all device nodes that rely on PSCI CPU suspend method usage.
> 
> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Acked-by: Nicolas Pitre <nico@linaro.org>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
>  .../devicetree/bindings/arm/idle-states.txt        | 679 +++++++++++++++++++++
>  Documentation/devicetree/bindings/arm/psci.txt     |  14 +-
>  3 files changed, 700 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Patch

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 298e2f6..6fd0f15 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -219,6 +219,12 @@  nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
@@ -415,3 +421,5 @@  cpus {
 --
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..37375c7
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,679 @@ 
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OS PM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OS PM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states definitions
+===========================================
+
+Idle states are characterized for a specific system through a set of
+timing and energy related properties, that underline the HW behaviour
+triggered upon idle states entry and exit.
+
+The following diagram depicts the CPU execution phases and related timing
+properties required to enter and exit an idle state:
+
+..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
+	    |          |           |          |          |
+
+	    |<------ entry ------->|
+	    |       latency        |
+					      |<- exit ->|
+					      |  latency |
+	    |<-------- min-residency -------->|
+		       |<-------  wakeup-latency ------->|
+
+		Diagram 1: CPU idle state execution phases
+
+EXEC:	Normal CPU execution.
+
+PREP:	Preparation phase before committing the hardware to idle mode
+	like cache flushing. This is abortable on pending wake-up
+	event conditions. The abort latency is assumed to be negligible
+	(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
+	goes back to EXEC. This phase is optional. If not abortable,
+	this should be included in the ENTRY phase instead.
+
+ENTRY:	The hardware is committed to idle mode. This period must run
+	to completion up to IDLE before anything else can happen.
+
+IDLE:	This is the actual energy-saving idle period. This may last
+	between 0 and infinite time, until a wake-up event occurs.
+
+EXIT:	Period during which the CPU is brought back to operational
+	mode (EXEC).
+
+entry-latency: Worst case latency required to enter the idle state. The
+exit-latency may be guaranteed only after entry-latency has passed.
+
+min-residency: Minimum period, including preparation and entry, for a given
+idle state to be worthwhile energywise.
+
+wakeup-latency: Maximum delay between the signaling of a wake-up event and the
+CPU being able to execute normal code again. If not specified, this is assumed
+to be entry-latency + exit-latency.
+
+These timing parameters can be used by an OS in different circumstances.
+
+An idle CPU requires the expected min-residency time to select the most
+appropriate idle state based on the expected expiry time of the next IRQ
+(ie wake-up) that causes the CPU to return to the EXEC phase.
+
+An operating system scheduler may need to compute the shortest wake-up delay
+for CPUs in the system by detecting how long it will take to get a CPU out
+of an idle state, eg:
+
+wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0)
+
+In other words, the scheduler can make its scheduling decision by selecting
+(eg waking-up) the CPU with the shortest wake-up latency.
+The wake-up latency must take into account the entry latency if that period
+has not expired. The abortable nature of the PREP period can be ignored
+if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than
+the worst case since it depends on the CPU operating conditions, ie caches
+state).
+
+An OS has to reliably probe the wakeup-latency since some devices can enforce
+latency constraints to work properly, so the OS has to detect the worst case
+wake-up latency it can incur if a CPU is allowed to enter an idle state, and
+possibly prevent that state from being entered to guarantee reliable device
+functioning.
+
+The min-residency time parameter deserves further explanation since it is
+expressed in time units but must factor in energy consumption coefficients.
+
+The energy consumption of a cpu when it enters a power state can be roughly
+characterised by the following graph:
+
+               |
+               |
+               |
+           e   |
+           n   |                                      /---
+           e   |                               /------
+           r   |                        /------
+           g   |                  /-----
+           y   |           /------
+               |       ----
+               |      /|
+               |     / |
+               |    /  |
+               |   /   |
+               |  /    |
+               | /     |
+               |/      |
+          -----|-------+----------------------------------
+              0|       1                              time(ms)
+
+		Graph 1: Energy vs time example
+
+The graph is split in two parts delimited by time 1ms on the X-axis.
+The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope
+and denotes the energy costs incurred whilst entering and leaving the idle
+state.
+The graph curve in the area delimited by X-axis values = {x | x > 1ms } has
+shallower slope and essentially represents the energy consumption of the idle
+state.
+
+min-residency is defined for a given idle state as the minimum expected
+residency time for a state (inclusive of preparation and entry) after
+which choosing that state becomes the most energy efficient option. A good
+way to visualise this is to take the graph above and compare the energy
+consumption plots of several states.
+
+For the sake of simplicity, let's consider a system with two idle states,
+IDLE1 and IDLE2:
+
+          |
+          |
+          |
+          |                                                  /-- IDLE1
+       e  |                                              /---
+       n  |                                         /----
+       e  |                                     /---
+       r  |                                /-----/--------- IDLE2
+       g  |                    /-------/---------
+       y  |        ------------    /---|
+          |       /           /----    |
+          |      /        /---         |
+          |     /    /----             |
+          |    / /---                  |
+          |   ---                      |
+          |  /                         |
+          | /                          |
+          |/                           |                  time
+       ---/----------------------------+------------------------
+          |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy
+                                       |
+                                IDLE2-min-residency
+
+		Graph 2: idle states min-residency example
+
+In graph 2 above, which takes into account idle states entry/exit energy
+costs, it is clear that if the idle state residency time (ie time till next
+wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state
+choice energywise.
+
+This is mainly down to the fact that IDLE1 entry/exit energy costs are lower
+than IDLE2.
+
+However, the lower power consumption (ie shallower energy curve slope) of idle
+state IDLE2 implies that after a suitable time, IDLE2 becomes more energy
+efficient.
+
+The time at which IDLE2 becomes more energy efficient than IDLE1 (and other
+shallower states in a system with multiple idle states) is defined as
+IDLE2-min-residency and corresponds to the time when the energy consumption
+of the IDLE1 and IDLE2 states breaks even.
+
+The definitions provided in this section underpin the idle states
+properties specification that is the subject of the following sections.
+
+===========================================
+3 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, it is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Value type: <stringlist>
+		Usage and definition depend on ARM architecture version.
+			# On ARM v8 64-bit this property is required and must
+			  be one of:
+			   - "psci" (see bindings in [2])
+			# On ARM 32-bit systems this property is optional
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node; any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+4 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	With the definitions provided above, the following list represents
+	the valid properties for a state node:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- local-timer-stop
+		Usage: See definition
+		Value type: <none>
+		Definition: if present the CPU local timer control logic is
+			    lost on state entry, otherwise it is retained.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency in
+			    microseconds required to enter the idle state.
+			    The exit-latency-us duration may be guaranteed
+			    only after entry-latency-us has passed.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing minimum residency duration
+			    in microseconds, inclusive of preparation and
+			    entry, for this idle state to be considered
+			    worthwhile energy wise (refer to section 2 of
+			    this document for a complete description).
+
+	- wakeup-latency-us:
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing maximum delay between the
+			    signaling of a wake-up event and the CPU being
+			    able to execute normal code again. If omitted,
+			    this is assumed to be equal to:
+
+				entry-latency-us + exit-latency-us
+
+			    It is important to supply this value on systems
+			    where the duration of PREP phase (see diagram 1,
+			    section 2) is non-negligible.
+			    In such systems entry-latency-us + exit-latency-us
+			    will exceed wakeup-latency-us by this duration.
+
+	In addition to the properties listed above, a state node may require
+	additional properties specific to the entry-method defined in the
+	idle-states node; please refer to the entry-method bindings
+	documentation for property definitions.
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	idle-states {
+		entry-method = "psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <80>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+			wakeup-latency-us = <130>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <950>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+			wakeup-latency-us = <1500>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <90>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+			wakeup-latency-us = <100>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <300>;
+			wakeup-latency-us = <150>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+			wakeup-latency-us = <1300>;
+		};
+	};
+
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	idle-states {
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <200>;
+			exit-latency-us = <100>;
+			min-residency-us = <400>;
+			wakeup-latency-us = <250>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <500>;
+			exit-latency-us = <1500>;
+			min-residency-us = <2500>;
+			wakeup-latency-us = <1700>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			min-residency-us = <900>;
+			wakeup-latency-us = <600>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			min-residency-us = <6500>;
+			wakeup-latency-us = <2300>;
+		};
+	};
+
+};
+
+===========================================
+5 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
diff --git a/Documentation/devicetree/bindings/arm/psci.txt b/Documentation/devicetree/bindings/arm/psci.txt
index b4a58f3..5aa40ed 100644
--- a/Documentation/devicetree/bindings/arm/psci.txt
+++ b/Documentation/devicetree/bindings/arm/psci.txt
@@ -50,6 +50,16 @@  Main node optional properties:
 
  - migrate       : Function ID for MIGRATE operation
 
+Device tree nodes that use the PSCI CPU_SUSPEND function (i.e. idle
+state nodes, as per the bindings in [1]) must specify the following property:
+
+- arm,psci-suspend-param
+		Usage: Required for idle state nodes [1] if the
+		       entry-method property of the corresponding
+		       idle-states node is set to "psci".
+		Value type: <u32>
+		Definition: power_state parameter to pass to the PSCI
+			    CPU_SUSPEND call.
 
 Example:
 
@@ -64,7 +74,6 @@  Case 1: PSCI v0.1 only.
 		migrate		= <0x95c10003>;
 	};
 
-
 Case 2: PSCI v0.2 only
 
 	psci {
@@ -88,3 +97,6 @@  Case 3: PSCI v0.2 and PSCI v0.1.
 
 		...
 	};
+
+[1] Kernel documentation - ARM idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt