About CPUIdle

Many CPUs today support multiple idle levels that differ in exit latency and in power consumption while idle. CPUIdle is a generic in-kernel infrastructure that separates idle policy (governor) from idle mechanism (driver), so that governors and drivers can be developed independently. A CPUIdle governor is a policy routine that decides which idle state to enter at any given time.

Interfaces to be implemented by ARM platforms

  • platform-specific cpuidle driver for SoC
  • ftrace tracepoints for latency instrumentation code

Linaro changes

CPUIdle Latency Instrumentation

A system can enter different idle states, and each idle state has an energy cost and latencies associated with it. The goal of this activity is to measure the latencies associated with each idle state (C state) and to find the optimum set of idle states for a given system. This will help reduce total power consumption at the system level.

We need to find the total sleep time and the total wake-up time in the idle path. Together, these give the latency for a given C state. We also need to track the idle time spent in each C state. The total accumulated energy for a given C state can then be calculated as:

 Total Accumulated energy = (idle_time * Cx_energy) + (sleep_time * Cx_sleep_energy) + (wkup_time * Cx_wakeup_energy) 

(The energy consumed during the sleep transition, during the wake-up transition, and during idle time differs, hence the three separate terms.)

Once we have the accumulated energy cost of each C state, we can find the set of C states with minimal energy consumption.
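
The calculation above can be sketched offline in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the Cx_* terms are interpreted here as average power draws (watts) so that multiplying by a time in seconds yields energy in joules. Real values would come from the instrumentation and from hardware measurements.

```python
from dataclasses import dataclass


@dataclass
class CStateSample:
    """Accumulated times (seconds) and assumed power draws (watts) for one C state."""
    name: str
    idle_time: float    # time spent resident in the state
    sleep_time: float   # total time spent entering the state
    wkup_time: float    # total time spent leaving the state
    idle_power: float   # stands in for Cx_energy
    sleep_power: float  # stands in for Cx_sleep_energy
    wkup_power: float   # stands in for Cx_wakeup_energy


def accumulated_energy(sample: CStateSample) -> float:
    """Apply the three-term formula: idle + sleep transition + wake-up transition."""
    return (sample.idle_time * sample.idle_power
            + sample.sleep_time * sample.sleep_power
            + sample.wkup_time * sample.wkup_power)


def rank_by_energy(samples):
    """Return the samples ordered from lowest to highest accumulated energy."""
    return sorted(samples, key=accumulated_energy)
```

With measured samples in hand, `rank_by_energy(samples)[0].name` would name the C state with the lowest accumulated energy over the measurement window.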

Feature Status

  • Status: original implementation in C code rejected in favour of more-scalable ftrace-based traces that can be analysed offline - In progress

Implementation

  1. Find the latency for each defined C state
    1. Find SW Latency (sleep and wake up)
      1. Identify Profile points in the code for calculating CPU Idle delay
      2. Implement ftrace instrumentation in all the profile points
      3. Add sysfs entries to display C state latencies
      4. Modify Powertop to display latency information
    2. Find HW Latency (sleep and wake up)
      1. Get the data from HW Team
  2. Optimize the number of C states (based on energy required for each C state)
    1. Calculate the accumulated energy spent in each C state at different points in time.
    2. Find the C states with minimal total energy
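
As an illustration of the offline analysis step, the sketch below sums per-state idle residency from a text trace. It assumes the trace contains the kernel's existing power:cpu_idle event, whose payload has the form "state=N cpu_id=M" and which reports state 4294967295 when a CPU leaves idle. The additional profile points proposed in the list above would need their own events, which are not modelled here; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches the timestamp and payload of a cpu_idle event line, for example:
#   <idle>-0  [000] d..1  100.000100: cpu_idle: state=2 cpu_id=0
EVENT_RE = re.compile(r"(\d+\.\d+): cpu_idle: state=(\d+) cpu_id=(\d+)")
IDLE_EXIT = 4294967295  # (u32)-1, reported when the CPU leaves idle


def residency_per_state(lines):
    """Return {state: total seconds resident}, summed over all CPUs."""
    entered = {}                 # cpu -> (state, entry timestamp)
    totals = defaultdict(float)
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue             # skip headers and unrelated events
        timestamp = float(match.group(1))
        state = int(match.group(2))
        cpu = int(match.group(3))
        if state == IDLE_EXIT:
            if cpu in entered:   # ignore an exit with no recorded entry
                prev_state, start = entered.pop(cpu)
                totals[prev_state] += timestamp - start
        else:
            entered[cpu] = (state, timestamp)
    return dict(totals)
```

The result gives the idle_time term per state; the sleep and wake-up times would have to be derived from the extra profile points.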

Future Work

ARM idle state -> C-state mapping

Adopt consistent language when mapping ARM idle states (across various SoCs) to C-states

Feature Status

  • Status: On hold. Current thought is that there are too many platform specific details between ARM SoCs to come up with one common set of C states.

Governor improvements for real-time use cases

Improve governor behaviour in real-time use cases so that use-case-specific C-state constraints no longer need to be set

Feature Status

  • Status: To look for issues in the menu governor, some tests were performed; the results are shown below.

Test case description for measuring and analysing the real-time behaviour of the menu governor

  • To simulate a real-time workload, random data is transferred from a Linux host PC to the USB mass storage device on the target board. The board provides 5 C states with exit latencies of 1, 300, 1000, 10000 and 40000 usec. The test program runs on the host PC and is invoked as ./test_usb_cpuidle [mass storage device node] [bytes of data to be transferred] [total iterations], e.g. ./test_usb_cpuidle /dev/sdb 5 10000. While the data transfer between host and target board is in progress, the powertop tool is run in parallel on the target board.

The test program is available here: test_usb_cpuidle.c.

The following two powertop measurements were taken:

  • case 1) No data transfer is happening:

    C state    Avg residency
    C0 Run     0.2%
    C0         3.2%
    C1         0.0%
    C2         0.0%
    C3         0.0%
    C4         96.6%

    Wake-ups per second: 4.1

  • case 2) Data transfer is happening:

    C state    Avg residency
    C0 Run     0.7%
    C0         98.2%
    C1         0.0%
    C2         0.0%
    C3         0.0%
    C4         1.1%

    Wake-ups per second: 79.7

The above tests were performed on a Samsung Orion board. Similar results were observed on the OMAP Zoom3 board and the Ericsson U8500 platform. The results show that the menu governor shifts residency from the deepest state (C4) to the low-latency C0 state during the USB transfer use case.
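
To make the role of exit latency concrete, here is a deliberately simplified sketch of latency-constrained state selection, using the five exit latencies quoted for the test board. It is not the menu governor's algorithm, which also weighs the predicted idle duration and other inputs; the function name and the latency limits are assumptions chosen for illustration.

```python
# Exit latencies in usec for the five C states of the test board (from the text).
BOARD_EXIT_LATENCIES_US = [1, 300, 1000, 10000, 40000]


def deepest_allowed_state(exit_latencies_us, latency_limit_us):
    """Return the index of the deepest state whose exit latency fits the limit.

    States are assumed to be ordered from shallowest to deepest. Index 0 is
    returned as a fallback when no state satisfies the limit.
    """
    chosen = 0
    for index, latency in enumerate(exit_latencies_us):
        if latency <= latency_limit_us:
            chosen = index
    return chosen


# A tight limit keeps the CPU in shallow states; a loose one allows the deepest.
print(deepest_allowed_state(BOARD_EXIT_LATENCIES_US, 500))    # -> 1
print(deepest_allowed_state(BOARD_EXIT_LATENCIES_US, 50000))  # -> 4
```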

Platform Cpuidle Driver Basic Consolidation

During the attempted upstreaming of a common i.MX cpuidle driver, Russell King noted that there was certain code being duplicated between various ARM SoC cpuidle drivers and that this would no longer be accepted. This commonly duplicated functionality could be moved to the core cpuidle layer.

Feature Status

Platform Cpuidle Driver Init Consolidation

It appears that there is an opportunity to consolidate much of the cpuidle init functionality that is currently being implemented by each platform or architecture.

Feature Status

* Cpuidle init consolidation was included in the basic cpuidle consolidation patch series but was removed after v3. Rob Lee is continuing work on init consolidation, and the current plan is to make a new submission before the end of Q1 2012.

Re-add cpuidle device specific states

A few months ago, the set of cpuidle states was implemented per cpuidle device, but it was then moved to the cpuidle driver, which saved some memory. Systems with heterogeneous CPUs, however, will need device-specific states, so an implementation that re-adds device-specific states (perhaps by pointer) needs to be added.
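
One way to picture the "by pointer" idea is a device that normally shares its driver's state table and only carries a reference to its own table when it needs one. The sketch below models that in Python; all class and field names are hypothetical and do not correspond to the kernel's structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IdleState:
    name: str
    exit_latency_us: int


@dataclass
class IdleDriver:
    states: List[IdleState]  # shared table, stored once per driver


@dataclass
class IdleDevice:
    driver: IdleDriver
    states_override: Optional[List[IdleState]] = None  # per-device table, if any

    def states(self) -> List[IdleState]:
        """Use the device-specific table when present, else the driver's table."""
        if self.states_override is not None:
            return self.states_override
        return self.driver.states
```

Devices without an override cost only one empty reference each, which keeps most of the memory saving of the per-driver layout.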

Feature Status

* Not yet started. Possible target for Q2 2012.

Re-add cpuidle prepare callback

A few months ago, a prepare callback function was removed as no platform was using it. But there has been a request by Mike Turquette to re-add this functionality as OMAP systems can make use of it.

Feature Status

* Not yet started.

Track both attempted and successful enter attempts

Currently (up to and including v3.3-rc5), cpuidle tracks the number of iterations and the time spent in each cpuidle state. Mike Turquette and Jon Hunter requested additional data collection:

"... Jon suggested that the 'usage' statistics that are reported in sysfs should also reflect failed versus successful C-state transitions, which is a great idea.  This could simply be achieved by renaming the current 'usage' count to something like 'transitions_attempted' and then conditionally increment a new counter within the 'if (entered_state >= 0)' block, perhaps named, 'transition_succeeded'.

This way the old 'usage' count paradigm is as accurate as the new time-keeping code.  Being able to easily observe which C-state tend to fail the most would be invaluable in tuning idle policy for maximum effectiveness."
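
The counting scheme described in the quote can be modelled as follows. This is a Python illustration only, not kernel code: the counter names are taken from the quote, while the class, the record() method and the failed property are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StateUsage:
    """Per-state counters as proposed in the quote above."""
    transitions_attempted: int = 0
    transition_succeeded: int = 0

    def record(self, entered_state: int) -> None:
        """Count one attempt; count a success only if a state was entered.

        entered_state mirrors the value checked in the quoted
        'if (entered_state >= 0)' block: negative means the transition failed.
        """
        self.transitions_attempted += 1
        if entered_state >= 0:
            self.transition_succeeded += 1

    @property
    def failed(self) -> int:
        """Attempts that did not result in entering the state."""
        return self.transitions_attempted - self.transition_succeeded
```

Comparing failed across states would show which C states tend to fail most often, which is the tuning signal the request is after.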

Feature Status

* Not yet started.

ARM-specific cpuidle governor?

When the last CPU is turned off, it may also be valid to turn off the SCU (1) and L2CC (2). In addition to latency requirements, the cluster can only be turned off if all devices on the ACP (3) are dormant.
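
The conditions above can be written as a single predicate. The sketch below is only a restatement of the paragraph in code; the function name, parameters and the way the latency requirement is expressed are assumptions.

```python
def cluster_can_power_down(cpus_off, acp_devices_dormant,
                           cluster_exit_latency_us, latency_limit_us):
    """Return True if the whole cluster (CPUs, SCU, L2CC) may be turned off.

    cpus_off and acp_devices_dormant are iterables of booleans, one entry per
    CPU and per ACP device respectively.
    """
    return (all(cpus_off)                    # the last CPU has been turned off
            and all(acp_devices_dormant)     # nothing on the ACP is still active
            and cluster_exit_latency_us <= latency_limit_us)
```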

Feature Status

  • Status: TBD

Coupled Cpuidle State Support

SMP systems with coupled CPU idle states require additional handling, and no common solution exists today. Colin Cross has submitted a patch to address this need:

http://comments.gmane.org/gmane.linux.ports.tegra/2649

Feature Status

  • Status: The patch is not yet accepted, but some discussion has taken place on the patch thread. More systems are to be tested when possible.


  1. Snoop Control Unit: transparently keeps data shared across processors coherent

  2. Level 2 Cache Controller

  3. Accelerator Coherency Port: allows an external bus master to access the same physical memory view as the processor cluster

WorkingGroups/PowerManagement/Archives/CPUIdle (last modified 2013-08-21 11:12:32)