Power Management for New Platforms

Introduction

This page gathers tips, links and documentation relevant to developing power management support for new platforms. Existing platforms can also make use of this document, but the approach taken assumes that little platform support exists at present.

Prerequisites

The platform needs basic support merged upstream already. This document builds on that basic support. For a great overview of these prerequisites for a new platform see Thomas Petazzoni's talk from ELC 2013.

Common Clk Framework

Documentation and presentations

Examples

The Sunxi platform has clk drivers as well as DeviceTree bindings for its clocks.

link1 link2 link3

Future Discussion about DVFS

Voltage Regulators

Generic Power Domains

presentation

Runtime PM

Introduction to runtime PM talk, by Kevin Hilman: http://people.linaro.org/~khilman/runtime_PM.html

Step 1. Implement platform-specific runtime PM core

Simple starting point:

The PM core has a generic implementation of clock-based runtime PM. Platform code simply has to define which clocks to manage.

Example: mach-davinci/pm_domain.c

Step 2. Convert drivers to runtime PM

See "Intro to runtime PM talk"

If drivers are already using clock API for PM, conversion is trivial:

- clk_enable() --> pm_runtime_get_sync()
- clk_disable() --> pm_runtime_put_sync()
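The mapping above can be illustrated with a small userspace model. This is only a sketch of the usage-counting idea, not kernel code: every `model_*` name is hypothetical, and the real `pm_runtime_get_sync()` and `pm_runtime_put_sync()` operate on a `struct device` and have richer return values. It assumes the platform runtime PM core gates a single clock when the device becomes idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: a stand-in for a device clock, not struct clk. */
struct model_clk {
	bool enabled;
	int enable_calls;	/* how many times the clock was switched on */
};

/* Userspace model only: a stand-in for a device with runtime PM state. */
struct model_dev {
	struct model_clk *clk;
	int usage_count;	/* number of active users of the device */
};

/* Models pm_runtime_get_sync(): resume the device for the first user. */
static int model_pm_runtime_get_sync(struct model_dev *dev)
{
	if (dev->usage_count++ == 0) {
		dev->clk->enabled = true;	/* runtime resume: clock on */
		dev->clk->enable_calls++;
	}
	return 0;
}

/* Models pm_runtime_put_sync(): suspend once the last user is gone. */
static int model_pm_runtime_put_sync(struct model_dev *dev)
{
	assert(dev->usage_count > 0);		/* unbalanced put is a driver bug */
	if (--dev->usage_count == 0)
		dev->clk->enabled = false;	/* runtime suspend: clock off */
	return 0;
}

/* Hypothetical driver I/O path after conversion; returns 1 if the clock was on. */
static int model_driver_transfer(struct model_dev *dev)
{
	int clock_was_on;

	model_pm_runtime_get_sync(dev);		/* was: clk_enable() */
	clock_was_on = dev->clk->enabled;	/* hardware access happens here */
	model_pm_runtime_put_sync(dev);		/* was: clk_disable() */
	return clock_was_on;
}
```

In this model the driver no longer touches the clock directly. It only announces when it needs the device, and the clock handling lives in one place.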

Suspend/resume

CPUidle

CPUidle is a framework divided into two parts: the generic part and the arch-dependent part. The latter is the one you should focus on if you want to integrate this feature for your board.

If you plan to upstream the driver, keep in mind that work is currently in progress to unify the different ARM drivers. To follow the same path, it is highly recommended to:

  • *not* copy and paste the code from another driver
  • look at making the code consistent with other drivers
  • split the pm code from the driver and put the driver in the drivers/cpuidle directory
  • read the LCA2012 presentation

The efficiency of the cpuidle driver depends on how the hardware supports the PM for the CPU and how the system behaves generally from an interrupt perspective.

The Wake source documentation and the "Who disturbs my slumber ?" presentation will help you understand how the cpuidle driver behaves with respect to interrupts.

Also, ARM systems can be multicluster, and the driver must, under some circumstances, synchronize the CPUs so that they enter the same idle state. The cpuidle on multicluster presentation may help you understand these constraints.

Drivers

The driver should be as simple as possible. The main problem to avoid is hackish code that tries to handle dependencies from within the driver itself.

You should identify the different states the driver should handle and find the dependencies between those states and the power domains the peripherals depend on. Peripherals must use pm_runtime, and each generic power domain should attach its power state to the corresponding cpuidle state.

The cpuidle core code will ensure a state won't be entered if a peripheral in use depends on its power domain. Please refer to the thread introducing the concept and the example.
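As a rough illustration of that rule, the sketch below models idle states that each power down at most one domain and skips any state whose domain still has active peripherals. It is a userspace model under stated assumptions, not the cpuidle or genpd implementation: all `model_*` and `gated_*` names are hypothetical, and state 0 is assumed to be always enterable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a power domain with a count of peripherals in use. */
struct model_power_domain {
	int active_devices;
};

/* Hypothetical idle state; domain is NULL if the state powers nothing down. */
struct gated_idle_state {
	const char *name;
	const struct model_power_domain *domain;	/* domain switched off */
};

/*
 * Return the index of the deepest state that may be entered.
 * States are ordered from shallowest (index 0) to deepest. A state is
 * skipped while any peripheral in the domain it powers down is in use.
 */
static int model_deepest_allowed_state(const struct gated_idle_state *states,
				       int count)
{
	int best = 0;	/* state 0 (e.g. WFI) is assumed always allowed */

	assert(count > 0);
	for (int i = 1; i < count; i++) {
		const struct model_power_domain *pd = states[i].domain;

		if (pd == NULL || pd->active_devices == 0)
			best = i;
	}
	return best;
}
```

With a table of three states where only the deepest one switches off a peripheral domain, the function returns index 1 while a device in that domain is active and index 2 once the domain is idle.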

Governors

There are currently two governors: menu and ladder. When an aggressive power management policy is wanted, the tickless system is enabled in the kernel and the menu governor is the best suited to work with it. For servers, a periodic tick and the ladder governor are the best choice for performance and responsiveness.

The menu governor relies on the next event planned for the system and on heuristics based on statistics of such events. It then chooses the deepest idle state whose exit latency and target residency fit within the time expected before the next planned event.
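The selection step described above can be sketched as follows. This is a simplified userspace model, not the menu governor's code: the struct, the function name and the microsecond units are assumptions made for illustration, and the real governor applies further heuristics that are not modelled here.

```c
#include <assert.h>

/* Hypothetical idle state description; all times in microseconds. */
struct model_cpuidle_state {
	unsigned int exit_latency_us;		/* time needed to wake up */
	unsigned int target_residency_us;	/* minimum stay to be worthwhile */
};

/*
 * Pick the deepest state whose exit latency and target residency both fit
 * within the predicted idle time. States are ordered from shallowest
 * (index 0) to deepest; state 0 is the fallback and is always allowed.
 */
static int model_select_state(const struct model_cpuidle_state *states,
			      int count, unsigned int predicted_idle_us)
{
	int chosen = 0;

	assert(count > 0);
	for (int i = 1; i < count; i++) {
		if (states[i].exit_latency_us > predicted_idle_us)
			continue;	/* would wake up too late */
		if (states[i].target_residency_us > predicted_idle_us)
			continue;	/* sleep too short to pay off */
		chosen = i;
	}
	return chosen;
}
```

For a table such as `{1, 1}, {100, 500}, {1500, 5000}`, a predicted idle time of 800 us selects index 1, because the deepest state needs at least 5000 us of residency.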

CPUfreq

Drivers

Many cpufreq drivers are present in the drivers/cpufreq directory and can serve as references when developing your platform's driver. However, using the existing generic drivers is strongly encouraged: cpufreq-cpu0 (for any SMP platform) and arm_big_little (for any ARM big.LITTLE platform).

Governors

Several generic cpufreq governors (ondemand, conservative, powersave, performance and userspace) are already mainlined and are easy to use. Android's interactive governor is not yet mainlined and is present in AOSP releases only. Writing a governor for an individual platform is not advised unless there is a real requirement that can't be satisfied with the existing infrastructure.

Devfreq

Drivers

Governors

Thermal

In Linux, the various thermal zones and cooling devices are exposed under /sys/class/thermal. A thermal zone (sensor) should define operations for reading the temperature, querying trip points and binding cooling devices, e.g. for the omap4 bandgap:

static struct thermal_zone_device_ops ti_thermal_ops = {
    .get_temp = ti_thermal_get_temp,
    .bind = ti_thermal_bind,
    .unbind = ti_thermal_unbind,
    .get_mode = ti_thermal_get_mode,
    .get_trip_type = ti_thermal_get_trip_type,
    .....
};

First the cooling device is registered and saved in the platform data, and then the thermal zone is registered. After this, the thermal ops bind() is called to associate the thermal zone with the cooling device for a specific trip point.

- bind

  • verify cooling device for the zone
  • thermal_zone_bind_cooling_device (for binding zone and cooling device for a specific trip point)

A cooling device defines operations to get and set its cooling state, which is selected by the governor logic. For example, for the cpufreq method of cooling, the operation structure is:

static struct thermal_cooling_device_ops const cpufreq_cooling_ops = {
    .get_max_state = cpufreq_get_max_state,
    .get_cur_state = cpufreq_get_cur_state,
    .set_cur_state = cpufreq_set_cur_state,
};

Once an interrupt arrives from the sensor, the temperature is read using get_temp(), and the corresponding trip point type and temperature are checked using the corresponding thermal zone operation functions. Depending on the governor used, these parameters are used to throttle the cooling device.
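To make the last step concrete, here is a minimal userspace sketch of one governor pass. It assumes a simple stepping policy chosen for illustration and is not a copy of any in-kernel governor. All `model_*` names and the millidegree Celsius unit are assumptions, and cooling state 0 is taken to mean "no throttling".

```c
#include <assert.h>

/* Hypothetical cooling device model: state 0 means no throttling. */
struct model_cooling_device {
	unsigned long cur_state;
	unsigned long max_state;
};

/* Hypothetical trip point; temperature in millidegrees Celsius. */
struct model_trip {
	int temp_mc;
};

/*
 * One governor pass: step the cooling state up while the zone is at or
 * above the trip temperature, and step it back down otherwise.
 */
static void model_throttle(struct model_cooling_device *cdev,
			   const struct model_trip *trip, int temp_mc)
{
	assert(cdev->cur_state <= cdev->max_state);

	if (temp_mc >= trip->temp_mc) {
		if (cdev->cur_state < cdev->max_state)
			cdev->cur_state++;	/* throttle harder */
	} else if (cdev->cur_state > 0) {
		cdev->cur_state--;		/* relax throttling */
	}
}
```

In a real driver the new state would be applied through the cooling device's set_cur_state operation. Here it is only stored in the model structure.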

WorkingGroups/PowerManagement/Doc/PowerManagementForNewPlatforms (last modified 2013-08-29 07:23:53)