========================================== ARM idle states binding description ========================================== ========================================== 1 - Introduction ========================================== ARM systems contain HW capable of managing power consumption dynamically, where cores can be put in different low-power states (ranging from simple wfi to power gating) according to OS PM policies. The CPU states representing the range of dynamic idle states that a processor can enter at run-time, can be specified through device tree bindings representing the parameters required to enter/exit specific idle states on a given processor. According to the Server Base System Architecture document (SBSA, [3]), the power states an ARM CPU can be put into are identified by the following list: - Running - Idle_standby - Idle_retention - Sleep - Off The power states described in the SBSA document define the basic CPU states on top of which ARM platforms implement power management schemes that allow an OS PM implementation to put the processor in different idle states (which include states listed above; "off" state is not an idle state since it does not have wake-up capabilities, hence it is not considered in this document). Idle state parameters (eg entry latency) are platform specific and need to be characterized with bindings that provide the required information to OS PM code so that it can build the required tables and use them at runtime. The device tree binding definition for ARM idle states is the subject of this document. =========================================== 2 - idle-states definitions =========================================== Idle states are characterized for a specific system through a set of timing and energy related properties, that underline the HW behaviour triggered upon idle states entry and exit. The following diagram depicts the CPU execution phases and related timing properties required to enter and exit an idle state: ..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__.. | | | | | |<------ entry ------->| | latency | |<- exit ->| | latency | |<-------- min-residency -------->| |<------- wakeup-latency ------->| Diagram 1: CPU idle state execution phases EXEC: Normal CPU execution. PREP: Preparation phase before committing the hardware to idle mode like cache flushing. This is abortable on pending wake-up event conditions. The abort latency is assumed to be negligible (i.e. less than the ENTRY + EXIT duration). If aborted, CPU goes back to EXEC. This phase is optional. If not abortable, this should be included in the ENTRY phase instead. ENTRY: The hardware is committed to idle mode. This period must run to completion up to IDLE before anything else can happen. IDLE: This is the actual energy-saving idle period. This may last between 0 and infinite time, until a wake-up event occurs. EXIT: Period during which the CPU is brought back to operational mode (EXEC). entry-latency: Worst case latency required to enter the idle state. The exit-latency may be guaranteed only after entry-latency has passed. min-residency: Minimum period, including preparation and entry, for a given idle state to be worthwhile energywise. wakeup-latency: Maximum delay between the signaling of a wake-up event and the CPU being able to execute normal code again. If not specified, this is assumed to be entry-latency + exit-latency. These timing parameters can be used by an OS in different circumstances. An idle CPU requires the expected min-residency time to select the most appropriate idle state based on the expected expiry time of the next IRQ (ie wake-up) that causes the CPU to return to the EXEC phase. An operating system scheduler may need to compute the shortest wake-up delay for CPUs in the system by detecting how long will it take to get a CPU out of an idle state, eg: wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0) In other words, the scheduler can make its scheduling decision by selecting (eg waking-up) the CPU with the shortest wake-up latency. The wake-up latency must take into account the entry latency if that period has not expired. The abortable nature of the PREP period can be ignored if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than the worst case since it depends on the CPU operating conditions, ie caches state). An OS has to reliably probe the wakeup-latency since some devices can enforce latency constraints guarantees to work properly, so the OS has to detect the worst case wake-up latency it can incur if a CPU is allowed to enter an idle state, and possibly to prevent that to guarantee reliable device functioning. The min-residency time parameter deserves further explanation since it is expressed in time units but must factor in energy consumption coefficients. The energy consumption of a cpu when it enters a power state can be roughly characterised by the following graph: | | | e | n | /--- e | /------ r | /------ g | /----- y | /------ | ---- | /| | / | | / | | / | | / | | / | |/ | -----|-------+---------------------------------- 0| 1 time(ms) Graph 1: Energy vs time example The graph is split in two parts delimited by time 1ms on the X-axis. The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope and denotes the energy costs incurred while entering and leaving the idle state. The graph curve in the area delimited by X-axis values = {x | x > 1ms } has shallower slope and essentially represents the energy consumption of the idle state. min-residency is defined for a given idle state as the minimum expected residency time for a state (inclusive of preparation and entry) after which choosing that state become the most energy efficient option. A good way to visualise this, is by taking the same graph above and comparing some states energy consumptions plots. For sake of simplicity, let's consider a system with two idle states IDLE1, and IDLE2: | | | | /-- IDLE1 e | /--- n | /---- e | /--- r | /-----/--------- IDLE2 g | /-------/--------- y | ------------ /---| | / /---- | | / /--- | | / /---- | | / /--- | | --- | | / | | / | |/ | time ---/----------------------------+------------------------ |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy | IDLE2-min-residency Graph 2: idle states min-residency example In graph 2 above, that takes into account idle states entry/exit energy costs, it is clear that if the idle state residency time (ie time till next wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state choice energywise. This is mainly down to the fact that IDLE1 entry/exit energy costs are lower than IDLE2. However, the lower power consumption (ie shallower energy curve slope) of idle state IDLE2 implies that after a suitable time, IDLE2 becomes more energy efficient. The time at which IDLE2 becomes more energy efficient than IDLE1 (and other shallower states in a system with multiple idle states) is defined IDLE2-min-residency and corresponds to the time when energy consumption of IDLE1 and IDLE2 states breaks even. The definitions provided in this section underpin the idle states properties specification that is the subject of the following sections. =========================================== 3 - idle-states node =========================================== ARM processor idle states are defined within the idle-states node, which is a direct child of the cpus node [1] and provides a container where the processor idle states, defined as device tree nodes, are listed. - idle-states node Usage: Optional - On ARM systems, it is a container of processor idle states nodes. If the system does not provide CPU power management capabilities or the processor just supports idle_standby an idle-states node is not required. Description: idle-states node is a container node, where its subnodes describe the CPU idle states. Node name must be "idle-states". The idle-states node's parent node must be the cpus node. The idle-states node's child nodes can be: - one or more state nodes Any other configuration is considered invalid. An idle-states node defines the following properties: - entry-method Value type: Usage and definition depend on ARM architecture version. # On ARM v8 64-bit this property is required and must be: - "psci" # On ARM 32-bit systems this property is optional The nodes describing the idle states (state) can only be defined within the idle-states node, any other configuration is considered invalid and therefore must be ignored. =========================================== 4 - state node =========================================== A state node represents an idle state description and must be defined as follows: - state node Description: must be child of the idle-states node The state node name shall follow standard device tree naming rules ([5], 2.2.1 "Node names"), in particular state nodes which are siblings within a single common parent must be given a unique name. The idle state entered by executing the wfi instruction (idle_standby SBSA,[3][4]) is considered standard on all ARM platforms and therefore must not be listed. With the definitions provided above, the following list represents the valid properties for a state node: - compatible Usage: Required Value type: Definition: Must be "arm,idle-state". - local-timer-stop Usage: See definition Value type: Definition: if present the CPU local timer control logic is lost on state entry, otherwise it is retained. - entry-latency-us Usage: Required Value type: Definition: u32 value representing worst case latency in microseconds required to enter the idle state. The exit-latency-us duration may be guaranteed only after entry-latency-us has passed. - exit-latency-us Usage: Required Value type: Definition: u32 value representing worst case latency in microseconds required to exit the idle state. - min-residency-us Usage: Required Value type: Definition: u32 value representing minimum residency duration in microseconds, inclusive of preparation and entry, for this idle state to be considered worthwhile energy wise (refer to section 2 of this document for a complete description). - wakeup-latency-us: Usage: Optional Value type: Definition: u32 value representing maximum delay between the signaling of a wake-up event and the CPU being able to execute normal code again. If omitted, this is assumed to be equal to: entry-latency-us + exit-latency-us It is important to supply this value on systems where the duration of PREP phase (see diagram 1, section 2) is non-neglibigle. In such systems entry-latency-us + exit-latency-us will exceed wakeup-latency-us by this duration. - status: Usage: Optional Value type: Definition: A standard device tree property [5] that indicates the operational status of an idle-state. If present, it shall be: "okay": to indicate that the idle state is operational. "disabled": to indicate that the idle state has been disabled in firmware so it is not operational. If the property is not present the idle-state must be considered operational. - idle-state-name: Usage: Optional Value type: Definition: A string used as a descriptive name for the idle state. In addition to the properties listed above, a state node may require additional properties specifics to the entry-method defined in the idle-states node, please refer to the entry-method bindings documentation for properties definitions. =========================================== 4 - Examples =========================================== Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method): cpus { #size-cells = <0>; #address-cells = <2>; CPU0: cpu@0 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x0>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU1: cpu@1 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x1>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU2: cpu@100 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x100>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU3: cpu@101 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x101>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU4: cpu@10000 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x10000>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU5: cpu@10001 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x10001>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU6: cpu@10100 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x10100>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU7: cpu@10101 { device_type = "cpu"; compatible = "arm,cortex-a57"; reg = <0x0 0x10101>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; }; CPU8: cpu@100000000 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x0>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU9: cpu@100000001 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x1>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU10: cpu@100000100 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x100>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU11: cpu@100000101 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x101>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU12: cpu@100010000 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x10000>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU13: cpu@100010001 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x10001>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU14: cpu@100010100 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x10100>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; CPU15: cpu@100010101 { device_type = "cpu"; compatible = "arm,cortex-a53"; reg = <0x1 0x10101>; enable-method = "psci"; cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; }; idle-states { entry-method = "psci"; CPU_RETENTION_0_0: cpu-retention-0-0 { compatible = "arm,idle-state"; arm,psci-suspend-param = <0x0010000>; entry-latency-us = <20>; exit-latency-us = <40>; min-residency-us = <80>; }; CLUSTER_RETENTION_0: cluster-retention-0 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x1010000>; entry-latency-us = <50>; exit-latency-us = <100>; min-residency-us = <250>; wakeup-latency-us = <130>; }; CPU_SLEEP_0_0: cpu-sleep-0-0 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x0010000>; entry-latency-us = <250>; exit-latency-us = <500>; min-residency-us = <950>; }; CLUSTER_SLEEP_0: cluster-sleep-0 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x1010000>; entry-latency-us = <600>; exit-latency-us = <1100>; min-residency-us = <2700>; wakeup-latency-us = <1500>; }; CPU_RETENTION_1_0: cpu-retention-1-0 { compatible = "arm,idle-state"; arm,psci-suspend-param = <0x0010000>; entry-latency-us = <20>; exit-latency-us = <40>; min-residency-us = <90>; }; CLUSTER_RETENTION_1: cluster-retention-1 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x1010000>; entry-latency-us = <50>; exit-latency-us = <100>; min-residency-us = <270>; wakeup-latency-us = <100>; }; CPU_SLEEP_1_0: cpu-sleep-1-0 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x0010000>; entry-latency-us = <70>; exit-latency-us = <100>; min-residency-us = <300>; wakeup-latency-us = <150>; }; CLUSTER_SLEEP_1: cluster-sleep-1 { compatible = "arm,idle-state"; local-timer-stop; arm,psci-suspend-param = <0x1010000>; entry-latency-us = <500>; exit-latency-us = <1200>; min-residency-us = <3500>; wakeup-latency-us = <1300>; }; }; }; Example 2 (ARM 32-bit, 8-cpu system, two clusters): cpus { #size-cells = <0>; #address-cells = <1>; CPU0: cpu@0 { device_type = "cpu"; compatible = "arm,cortex-a15"; reg = <0x0>; cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; }; CPU1: cpu@1 { device_type = "cpu"; compatible = "arm,cortex-a15"; reg = <0x1>; cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; }; CPU2: cpu@2 { device_type = "cpu"; compatible = "arm,cortex-a15"; reg = <0x2>; cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; }; CPU3: cpu@3 { device_type = "cpu"; compatible = "arm,cortex-a15"; reg = <0x3>; cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; }; CPU4: cpu@100 { device_type = "cpu"; compatible = "arm,cortex-a7"; reg = <0x100>; cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; }; CPU5: cpu@101 { device_type = "cpu"; compatible = "arm,cortex-a7"; reg = <0x101>; cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; }; CPU6: cpu@102 { device_type = "cpu"; compatible = "arm,cortex-a7"; reg = <0x102>; cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; }; CPU7: cpu@103 { device_type = "cpu"; compatible = "arm,cortex-a7"; reg = <0x103>; cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; }; idle-states { CPU_SLEEP_0_0: cpu-sleep-0-0 { compatible = "arm,idle-state"; local-timer-stop; entry-latency-us = <200>; exit-latency-us = <100>; min-residency-us = <400>; wakeup-latency-us = <250>; }; CLUSTER_SLEEP_0: cluster-sleep-0 { compatible = "arm,idle-state"; local-timer-stop; entry-latency-us = <500>; exit-latency-us = <1500>; min-residency-us = <2500>; wakeup-latency-us = <1700>; }; CPU_SLEEP_1_0: cpu-sleep-1-0 { compatible = "arm,idle-state"; local-timer-stop; entry-latency-us = <300>; exit-latency-us = <500>; min-residency-us = <900>; wakeup-latency-us = <600>; }; CLUSTER_SLEEP_1: cluster-sleep-1 { compatible = "arm,idle-state"; local-timer-stop; entry-latency-us = <800>; exit-latency-us = <2000>; min-residency-us = <6500>; wakeup-latency-us = <2300>; }; }; }; =========================================== 5 - References =========================================== [1] ARM Linux Kernel documentation - CPUs bindings Documentation/devicetree/bindings/arm/cpus.yaml [2] ARM Linux Kernel documentation - PSCI bindings Documentation/devicetree/bindings/arm/psci.txt [3] ARM Server Base System Architecture (SBSA) http://infocenter.arm.com/help/index.jsp [4] ARM Architecture Reference Manuals http://infocenter.arm.com/help/index.jsp [5] Devicetree Specification https://www.devicetree.org/specifications/