big.LITTLE offers the best of both worlds, performance and power savings as needed, but the Linux scheduler must be enhanced to take advantage of this capability. The discussions at the Scheduler mini-summit point to a long-term effort to make the Linux scheduler more power-aware. This work should be generally useful to heterogeneous systems.
The goal is to provide a usable mechanism that reliably allows all work to be moved off a CPU, so that the CPU can be powered off and back on under user-application control.
This specification outlines the efforts happening inside Linaro to help the community come up with a good solution for heterogeneous multiprocessing (HMP) scheduling.
- Make it easy for scheduler developers to work with big.LITTLE (by emulation on current x86 hardware)
- Make it easy for scheduler developers to test against typical mobile workloads
- Create a HOWTO/white paper (for medium term use) on how to use current tools/techniques on big.LITTLE systems
- Identify issues preventing the CPU from reaching a quiescent state (e.g. timers, workqueues)
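One low-tech starting point for spotting what keeps a CPU from going quiescent is to compare per-CPU interrupt counts from /proc/interrupts over an interval. The following is a minimal sketch, assuming the usual /proc/interrupts layout (a header row of CPUn columns followed by one row per interrupt source). The helper names parse_interrupts and busiest_sources are hypothetical, and timers and workqueues would need other sources such as tracing.

```python
def parse_interrupts(text):
    """Return {source: [count_cpu0, count_cpu1, ...]} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break                        # reached the description columns
            counts.append(int(field))
        counts += [0] * (ncpus - len(counts))  # pad short rows such as ERR
        table[name.strip()] = counts
    return table


def busiest_sources(before, after, cpu, top=5):
    """Sources that fired most often on `cpu` between two snapshots."""
    deltas = {}
    for name, counts in after.items():
        old = before.get(name, [0] * len(counts))
        delta = counts[cpu] - old[cpu]
        if delta > 0:
            deltas[name] = delta
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

In use, one would read /proc/interrupts twice, a few seconds apart, and call busiest_sources() for the CPU that refuses to stay idle.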
- Review and test Paul Turner's per-task load tracking patchset and help it get integrated into mainline: Vincent
- Provide best practices for using existing facilities (cgroups, cpusets, affinity, etc.) to adapt workloads to big.LITTLE systems: Vincent
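As one concrete instance of the existing facilities, a task can be confined to a chosen set of CPUs through CPU affinity. The sketch below uses Python's os.sched_getaffinity and os.sched_setaffinity (Linux only). Which CPU numbers make up the LITTLE cluster is platform-specific, so LITTLE_CPUS is a hypothetical value, and restrict_to is an illustrative helper rather than an established best practice.

```python
import os

# Hypothetical: which CPU numbers form the LITTLE cluster depends on the SoC.
LITTLE_CPUS = {0, 1}


def restrict_to(cpus, pid=0):
    """Restrict `pid` (0 = calling process) to `cpus`.

    Only CPUs the task is already allowed to use are kept, so the call
    cannot widen an affinity mask set by a cpuset or a parent process.
    """
    allowed = os.sched_getaffinity(pid)
    target = allowed & set(cpus)
    if not target:
        raise ValueError("none of the requested CPUs are available to this task")
    os.sched_setaffinity(pid, target)
    return target
```

A background task would call restrict_to(LITTLE_CPUS) at startup. Because the affinity mask is inherited by child processes, a small launcher can apply the same policy to unmodified programs.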
- Provide an easy way to emulate a big.LITTLE system on x86 workstations for everyone to test workloads on. Although this is not a substitute for real hardware, it should enable substantial software experimentation and development. There are currently several ways to achieve this; determine the best one: Amit K
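One of the possible emulation approaches is to cap the maximum frequency of a subset of x86 CPUs so that they behave as "little" cores. This page does not say which approach was chosen, so the following is only a sketch of that one option. It assumes the standard cpufreq sysfs attributes cpuinfo_max_freq and scaling_max_freq. The helper names and the 0.5 speed ratio are hypothetical, and frequency capping does not reproduce microarchitectural differences between core types.

```python
from pathlib import Path

CPUFREQ = "/sys/devices/system/cpu/cpu{cpu}/cpufreq/{attr}"


def read_max_freqs(cpus):
    """Read cpuinfo_max_freq (kHz) for each CPU in `cpus`."""
    return {
        cpu: int(Path(CPUFREQ.format(cpu=cpu, attr="cpuinfo_max_freq")).read_text())
        for cpu in cpus
    }


def plan_caps(max_freq_khz, little_cpus, ratio=0.5):
    """Map each emulated 'little' CPU to a capped frequency in kHz.

    `ratio` is a hypothetical LITTLE-to-big speed ratio, not a measured one.
    """
    return {cpu: int(max_freq_khz[cpu] * ratio) for cpu in sorted(little_cpus)}


def apply_caps(caps):
    """Write the caps to scaling_max_freq. Needs root privileges."""
    for cpu, khz in caps.items():
        Path(CPUFREQ.format(cpu=cpu, attr="scaling_max_freq")).write_text(str(khz))
```

For a four-CPU workstation, apply_caps(plan_caps(read_max_freqs(range(4)), {2, 3})) would leave CPUs 0 and 1 as "big" cores and cap CPUs 2 and 3 at half their maximum frequency.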
- Produce synthetic test load(s) for typical smartphone use cases: Dmitry
- Easily used by kernel hackers (must not require an Android setup, preferably a C program or sh/perl/python script)
- Must produce useful figures of merit:
- User experience (e.g., response time).
- Proxy for energy consumption, for example idle residency or DVFS state residency.
- A sched-top tool to analyze scheduler traces and statistics.
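The DVFS state residency proxy mentioned above can be sketched from the cpufreq statistics file time_in_state, which lists one "frequency time" pair per line. This assumes the kernel is built with cpufreq statistics; the function names are hypothetical.

```python
def parse_time_in_state(text):
    """Return {freq_khz: ticks} from the contents of cpufreq stats time_in_state."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            freq, ticks = line.split()
            result[int(freq)] = int(ticks)
    return result


def residency(before, after):
    """Fraction of the interval between two snapshots spent at each frequency."""
    deltas = {freq: after[freq] - before.get(freq, 0) for freq in after}
    total = sum(deltas.values())
    if total == 0:
        return {freq: 0.0 for freq in deltas}
    return {freq: delta / total for freq, delta in deltas.items()}
```

A test load would snapshot /sys/devices/system/cpu/cpuN/cpufreq/stats/time_in_state before and after the run and report residency(). Idle residency could be derived the same way from the per-state time files under /sys/devices/system/cpu/cpuN/cpuidle/.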
Need an interactivity test better than cyclictest. Possibly build on Con Kolivas's work. [PJT & APZ]
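As a baseline for comparison, the kind of measurement such a test starts from is the overshoot of a timed sleep: how much later than requested the task actually ran again. The sketch below is not the improved test this item asks for; it only shows the measurement, with hypothetical function names and an interpreter-level timer that adds its own overhead.

```python
import time


def wakeup_latencies(interval_s=0.001, samples=200):
    """Return the overshoot, in seconds, of each timed sleep."""
    latencies = []
    for _ in range(samples):
        deadline = time.monotonic() + interval_s
        time.sleep(interval_s)
        latencies.append(time.monotonic() - deadline)
    return latencies


def summarize(latencies):
    """Min, median, 99th percentile and max of a list of latencies."""
    ordered = sorted(latencies)
    last = len(ordered) - 1
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "p99": ordered[min(last, int(len(ordered) * 0.99))],
        "max": ordered[-1],
    }
```

Running summarize(wakeup_latencies()) while a synthetic load executes gives a rough response-time figure. A better interactivity test would have to model user-visible events rather than a bare periodic timer.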
- Accumulate a list of the attributes that SoC vendors believe to be important to include in the scheduling decision: Vincent
The current list contains:
- Power-domain and clock-domain constraints. For example, many ARM SoCs require that all CPUs in a cluster run at the same clock rate.
- Thermal feedback and tradeoffs
- Process type
- Relative benefit of reducing frequency of several CPUs as opposed to consolidating workload on a small number of CPUs.
- Instruction-per-clock (IPC) measurements and correlation between clock rate and useful forward progress.
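The clock-domain constraint in the list above can be illustrated with a small model: each CPU requests a frequency, but every CPU in a cluster has to run at one shared rate. Picking the highest request in the cluster is an assumption made here so that no CPU runs slower than it asked for; the function name is hypothetical.

```python
def resolve_cluster_freqs(requests, clusters):
    """Grant one shared frequency per cluster.

    requests: {cpu: desired_khz}
    clusters: iterable of CPU collections that share a clock
    Returns {cpu: granted_khz}, with each cluster at its highest request.
    """
    granted = {}
    for cluster in clusters:
        shared = max(requests[cpu] for cpu in cluster)
        for cpu in cluster:
            granted[cpu] = shared
    return granted
```

With requests {0: 300000, 1: 1200000} in one cluster, both CPUs are granted 1200000 kHz, so CPU 0 runs four times faster than it needs to. That gap is the cost the scheduler would want to see when deciding where to place a task.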
- Improving quiescence of CPUs:
- Create some way to allow a single userspace operation to evacuate processes, timers, and irqs from a specified CPU, which then goes idle. Similarly, there needs to be a single userspace operation to restore a CPU to runnable state.
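Until a single operation exists, userspace has to evacuate a CPU piece by piece. The sketch below covers only the interrupt part, by writing an affinity mask that excludes the CPU to each /proc/irq/N/smp_affinity file. The function names are hypothetical and the sketch is limited to 32 CPUs to keep the mask format simple. Tasks would still have to be moved separately, and timers cannot be migrated this way, which is the gap this work item is meant to close.

```python
from pathlib import Path


def mask_without(cpu, ncpus):
    """Hex affinity mask for CPUs 0..ncpus-1 with `cpu` removed."""
    if not 0 <= cpu < ncpus:
        raise ValueError("cpu out of range")
    if ncpus > 32:
        raise ValueError("this sketch only handles up to 32 CPUs")
    mask = ((1 << ncpus) - 1) & ~(1 << cpu)
    if mask == 0:
        raise ValueError("cannot evacuate the only CPU")
    return format(mask, "x")


def evacuate_irqs(cpu, ncpus, irq_dir="/proc/irq"):
    """Steer every movable IRQ away from `cpu`. Needs root privileges.

    Returns (moved, failed) lists of IRQ directory names.
    """
    mask = mask_without(cpu, ncpus)
    moved, failed = [], []
    for entry in Path(irq_dir).iterdir():
        target = entry / "smp_affinity"
        if not target.is_file():
            continue
        try:
            target.write_text(mask)
            moved.append(entry.name)
        except OSError:
            failed.append(entry.name)   # some IRQs cannot be re-routed
    return moved, failed
```

The length of the failed list is itself a useful data point: it shows how far a userspace-only approach falls short of full evacuation.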
- Experiments to reduce CPU-hotplug system-wide disturbance:
- When a CPU is removed, don't kill its kthreads and don't deallocate its data. This also requires that the CPU-online path check for first bringup of a given CPU, at which point kthreads must be created and data structures must be allocated. This should greatly reduce CPU-online overhead and latency.
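The first-bringup check can be modelled in a few lines. This is a userspace illustration of the control flow only, not kernel code; the class and attribute names are hypothetical.

```python
class HotplugModel:
    """Keeps per-CPU resources across offline/online cycles."""

    def __init__(self):
        self.initialized = set()   # CPUs brought up at least once
        self.online = set()
        self.first_bringups = 0    # how often the expensive path ran

    def cpu_online(self, cpu):
        if cpu not in self.initialized:
            self._first_bringup(cpu)
        self.online.add(cpu)

    def cpu_offline(self, cpu):
        # Per-CPU threads and data are deliberately left in place.
        self.online.discard(cpu)

    def _first_bringup(self, cpu):
        # Stand-in for creating kthreads and allocating data structures.
        self.first_bringups += 1
        self.initialized.add(cpu)
```

Cycling a CPU offline and online any number of times runs the expensive path once, which is where the expected saving in online latency comes from.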
- There needs to be a how-to for "parking" per-CPU kthreads whose CPU is offline. Challenges:
- A wakeup can be delayed, so that it arrives at the kthread after the corresponding CPU has gone offline. Just having the CPU sleep is defeated.
- Many of the per-CPU kthreads are coded assuming that they will always be running on their CPU.
- The kthread must sleep interruptibly, otherwise soft-lockup messages will be emitted.
- Does "kill -STOP" work?
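The delayed-wakeup challenge can be shown with a userspace model: a worker that, once parked, keeps late wakeups pending instead of acting on them, and handles them after it is unparked. This is an analogy built on Python threads with hypothetical names. It says nothing about how the kernel implements or should implement parking.

```python
import threading


class ParkableWorker:
    def __init__(self):
        self._cond = threading.Condition()
        self._parked = False
        self._stopping = False
        self._pending = 0
        self.handled = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        with self._cond:
            while True:
                # Sleep while parked or idle; a wakeup alone is not enough.
                while not self._stopping and (self._parked or self._pending == 0):
                    self._cond.wait()
                if self._stopping:
                    return
                self._pending -= 1
                self.handled += 1
                self._cond.notify_all()

    def wake(self):
        """Deliver a wakeup; it stays pending if the worker is parked."""
        with self._cond:
            self._pending += 1
            self._cond.notify_all()

    def park(self):
        with self._cond:
            self._parked = True

    def unpark(self):
        with self._cond:
            self._parked = False
            self._cond.notify_all()

    def wait_idle(self, timeout=5.0):
        """Block until all pending wakeups are handled, or time out."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)

    def stop(self):
        with self._cond:
            self._stopping = True
            self._cond.notify_all()
        self._thread.join()
```

A wake() that arrives after park() is neither lost nor acted upon, and the worker sleeps in a wait that can be interrupted by stop().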
- Wean CPU hotplug from stop_machine(). This requires review and possible updates to the CPU_DYING notifiers, and possibly other adjustments. This reduces OS jitter and avoids waking idle CPUs. [Paul E. McKenney to remove the stop-machine dependencies in RCU's CPU_DYING notifiers.]
- Measure the speed of offlining and onlining a CPU. The target is a handful of milliseconds, fewer than five.
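A measurement harness for this can be small. The sketch below times writes to the per-CPU sysfs file "online", assuming the write returns only once the transition has completed. The function names are hypothetical, root privileges are required, and the boot CPU often cannot be offlined. The set_online parameter exists so the timing loop can be exercised without touching real CPUs.

```python
import time
from pathlib import Path


def sysfs_set_online(cpu, online):
    """Online or offline `cpu` through sysfs. Needs root privileges."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/online")
    path.write_text("1" if online else "0")


def time_hotplug(cpu, set_online=sysfs_set_online, rounds=10):
    """Return (offline_ms, online_ms): per-round durations in milliseconds."""
    offline_ms, online_ms = [], []
    for _ in range(rounds):
        t0 = time.perf_counter()
        set_online(cpu, False)
        t1 = time.perf_counter()
        set_online(cpu, True)
        t2 = time.perf_counter()
        offline_ms.append((t1 - t0) * 1000.0)
        online_ms.append((t2 - t1) * 1000.0)
    return offline_ms, online_ms


def within_budget(samples_ms, budget_ms=5.0):
    """True if the worst sample is under the budget (five milliseconds here)."""
    return max(samples_ms) < budget_ms
```

Checking the worst case rather than the mean matches the intent of the target, since a single slow transition is what a user of the mechanism would notice.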
- Remove sched_mc. [Peter Zijlstra]
2012 Q4 - 2013
- See if 3D gaming engines can make good use of big.LITTLE, even in the presence of thermal throttling.
- Investigate alternative scheduler disciplines, e.g. SCHED_EDF.
- Investigate modal scheduling. Paul Turner gave the following as an example:
- Low load of interactive, low-utilization tasks might favor race to idle.
- Moderate load of periodic media-feeding tasks might lower frequency to the smallest value that allows the task to keep up with its hardware.
- High load of CPU-bound tasks in the absence of thermal limitations might increase frequency.
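The three cases in this example can be written down as a lookup table, plus the one calculation the media case implies: the lowest available frequency at which a periodic task still finishes its work each period. The mode names, the load and task labels, and the cycles-per-period input are all hypothetical; nothing here is an actual scheduler interface.

```python
MODES = {
    ("low", "interactive"): "race_to_idle",
    ("moderate", "periodic_media"): "lowest_sufficient_frequency",
    ("high", "cpu_bound"): "raise_frequency",
}


def choose_mode(load, task_kind, thermally_limited=False):
    """Return the mode for a (load, task kind) pair, or None if not covered."""
    mode = MODES.get((load, task_kind))
    if mode == "raise_frequency" and thermally_limited:
        return None   # the example only covers the unthrottled case
    return mode


def lowest_sufficient_freq(available_khz, cycles_per_period, period_s):
    """Smallest frequency (kHz) that completes the work within each period."""
    needed_khz = cycles_per_period / period_s / 1000.0
    for khz in sorted(available_khz):
        if khz >= needed_khz:
            return khz
    return None       # even the highest frequency cannot keep up
```

For instance, a task needing 7,000,000 cycles every 10 ms requires 700,000 kHz, so with available frequencies of 500,000, 1,000,000 and 1,500,000 kHz the function picks 1,000,000 kHz.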
WorkingGroups/PowerManagement/Archives/OldSpecs/PowerAwareScheduling (last modified 2013-08-22 10:02:01)