The Linaro Android Build Service (aka linaro-cloud-buildd)

The Linaro Android Build Service allows Linaro engineers to build Android in various configurations in the Cloud.

1. User Guide

1.1. Basics

The Linaro Android Build Service lives at https://android-build.linaro.org (ignore the self-signed certificate for now).

It consists of a number of jobs. Each job defines the source to build and how to build it (see section 1.3, Build Configurations), and has an owner and a name.

1.2. Getting Access

To be able to set up jobs and request builds, you need to be a member of the ~linaro-android-builders team on Launchpad.

1.3. Build Configurations

A "build configuration" defines some source (accessed through Google's repo tool) and how to build it. Its syntax is a series of variable=value assignments, following shell syntax. Besides being interpreted by the build scripts as described below, all configuration variables are also exported to the environment, so they may affect the build procedure for a particular target. This can work for or against you, so please make sure you don't choose variable names that clash with well-known variables used by make or autotools.

Example of a build config:

BUILD_TYPE=build-android
MANIFEST_REPO=git://android.git.linaro.org/platform/manifest.git
MANIFEST_BRANCH=linaro_android_2.3.3
TARGET_PRODUCT=beagleboard
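
Because a config is plain shell assignments, one way to picture how it reaches the build is to source it with automatic export enabled. The following is a minimal sketch, not the service's actual loader; the temporary file and the use of `set -a` are assumptions made for illustration.

```shell
# Hypothetical: write the example config to a temporary file.
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
BUILD_TYPE=build-android
MANIFEST_REPO=git://android.git.linaro.org/platform/manifest.git
MANIFEST_BRANCH=linaro_android_2.3.3
TARGET_PRODUCT=beagleboard
EOF

# 'set -a' marks every variable assigned from here on for export,
# so sourcing the file both defines and exports the settings.
set -a
. "$config_file"
set +a
rm -f "$config_file"

# A child process (such as make) now sees the variables.
sh -c 'echo "Building $TARGET_PRODUCT from $MANIFEST_BRANCH"'
```

This also shows why a variable name that make or autotools already uses would leak into the build: every assignment becomes part of the child environment.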

The build tool itself only understands a single variable:

  • BUILD_TYPE: Defines the type of build. For each type, there is a script in the build tools that handles the build. (This variable was previously called SCRIPT_NAME; that name is now deprecated.) The current choices are:

    • build-android - build Android target images (default)

    • build-android-restricted - build Android targets requiring restricted-access code (the build should usually be owned by a restricted group as well)

    • build-android-toolchain - build upstream Android toolchain (alpha)

    • build-android-toolchain-linaro - build Linaro Android toolchain
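
The selection of a build script from BUILD_TYPE can be sketched as follows. This is an illustration under assumptions, not the code of the build tool: the function name, the BUILD_SCRIPT_DIR variable and the fallback to the deprecated SCRIPT_NAME are all hypothetical.

```shell
# Sketch only: BUILD_SCRIPT_DIR is a hypothetical location for the
# build-scripts directory of the build tools.
resolve_build_script() {
    # Prefer BUILD_TYPE, fall back to the deprecated SCRIPT_NAME
    # (an assumed behavior), and finally to the documented default.
    build_type="${BUILD_TYPE:-${SCRIPT_NAME:-build-android}}"
    echo "${BUILD_SCRIPT_DIR:-build-scripts}/${build_type}"
}

BUILD_TYPE=build-android-toolchain
resolve_build_script
```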

All other variables are interpreted by the specific build script, and their semantics depend on it. However, there are some conventions that all scripts dealing with a particular feature should follow:

  • MANIFEST_REPO: The location of the manifest repository (passed to repo init's -u option). Defaults to git://android.git.kernel.org/platform/manifest.git. Almost all jobs will change this.

  • MANIFEST_BRANCH: The branch within the manifest repository to check out (passed to repo init's -b option). Defaults to master. Most jobs change this.

  • MANIFEST_FILENAME: The name of the initial manifest file (passed to repo init's -m option). Defaults to default.xml. Very few jobs change this.

  • REPO_QUIET: A non-empty value (e.g. 'true') makes repo much less verbose in build logs.

  • MAKE_TARGETS: For build scripts invoking make, this overrides the make targets the script uses by default.

  • MAKE_JOBS: For build scripts invoking make, this overrides the value used for the -j option (number of concurrent jobs).

  • RAMDISK_SIZE: Overrides the build ramdisk size. A value of 0 disables the ramdisk, which is useful in case of ramdisk overflow issues.

  • EXTERNAL_TARBALL: Requests that the build system fetch an archive from the given location and unpack it into build/external_tarballs, where build is the build root directory (the exact path can be written as $BUILD_SCRIPT_ROOT/../../build/external_tarballs). Multiple archives can be requested, separated by ";", e.g. EXTERNAL_TARBALL="http://foo.com/ball1.tar.gz;http://foo.com/ball2.tar.bz2"

  • SOURCE_OVERLAY: Requests that the build system fetch an archive from the authorized overlay storage area (http://snapshots.linaro.org/android/binaries/) and unpack it into the Android source tree after the "repo sync" command has finished. This can be used, for example, to add vendor components. Some archives stored in the overlay area are provided under a proprietary license, which requires special handling in the build. To use such an archive in a build:

    • Open the URL of the archive in a fresh/anonymous browser session; you should be presented with the license text.
    • Carefully read the license and make sure that the build you create abides by its terms. Pay special attention to the redistribution of build results: often the same or a similar license must be accepted by third parties before they download them.
    • Once you are sure that the intended redistribution and usage of the build abide by the license terms, open the HTML source of the license page and look up the hexadecimal hash ID of the license (it should be close to the bottom of the page).
    • Add the following variable to the build config: SOURCE_OVERLAY_ACCEPT_LICENSE=<hash>
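
To make the conventions above concrete, here is a small sketch of how a build script might turn the MANIFEST_* variables into a repo init command line and split a multi-archive EXTERNAL_TARBALL value. The function names are hypothetical, and the real scripts may do this differently; only the option letters and defaults are taken from the list above.

```shell
# Compose the repo init command from the MANIFEST_* variables,
# applying the documented defaults when a variable is unset.
repo_init_command() {
    echo "repo init" \
        "-u ${MANIFEST_REPO:-git://android.git.kernel.org/platform/manifest.git}" \
        "-b ${MANIFEST_BRANCH:-master}" \
        "-m ${MANIFEST_FILENAME:-default.xml}"
}

# Print each archive URL from a ';'-separated EXTERNAL_TARBALL value,
# one per line. The subshell keeps the IFS change local.
list_external_tarballs() {
    (
        IFS=';'
        set -f   # disable globbing while the value is split
        for url in $1; do
            if [ -n "$url" ]; then echo "$url"; fi
        done
    )
}

MANIFEST_BRANCH=linaro_android_2.3.3
repo_init_command
list_external_tarballs "http://foo.com/ball1.tar.gz;http://foo.com/ball2.tar.bz2"
```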

1.3.1. Build variables for BUILD_TYPE=build-android

The following options are interpreted by the build-android script:

  • TOOLCHAIN_URL: The location of a pre-built binary tarball containing a toolchain to download and use for building (XXX document this a bit).

  • TARGET_PRODUCT: The device to target, e.g. beagleboard, panda, ...

  • TARGET_TOOLS_PREFIX: Overrides the toolchain used for building Android, e.g. TARGET_TOOLS_PREFIX=prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin/arm-eabi- will use the gcc-4.4.0 based toolchain. This should generally point to one of the AOSP in-tree prebuilt toolchains, but may also point to a toolchain provided by another package. Note that providing the TOOLCHAIN_URL option automatically sets TARGET_TOOLS_PREFIX to that toolchain (if you explicitly set TARGET_TOOLS_PREFIX in this case, it will be ignored).

  • BUILD_FS_IMAGE: If set to 1, a filesystem image, ready to be dd'ed to a card for booting, is produced in addition to the tarballs. The filesystem size can be set with FS_IMAGE_SIZE (default 2G). The image is produced with linaro-android-media-create from linaro-image-tools, so other parameters, such as filesystem type(s) and partition layout, are governed by that tool.

  • As explained above, you may use other options/variables as supported by Android makefiles (they will be exported to the environment).
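
The precedence between TOOLCHAIN_URL and TARGET_TOOLS_PREFIX described above can be sketched like this. It is an illustration only: the function name and DOWNLOADED_PREFIX (a stand-in for wherever the downloaded toolchain ends up) are hypothetical, as is the example URL.

```shell
# Sketch: decide which toolchain prefix a build would use.
# DOWNLOADED_PREFIX is a hypothetical stand-in for the prefix of the
# toolchain unpacked from the TOOLCHAIN_URL tarball.
effective_tools_prefix() {
    if [ -n "${TOOLCHAIN_URL:-}" ]; then
        # TOOLCHAIN_URL wins; an explicit TARGET_TOOLS_PREFIX is ignored.
        echo "${DOWNLOADED_PREFIX:-}"
    else
        # May be empty, in which case the Android build's default applies.
        echo "${TARGET_TOOLS_PREFIX:-}"
    fi
}

TARGET_TOOLS_PREFIX=prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin/arm-eabi-
effective_tools_prefix
```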

1.3.2. Build variables for BUILD_TYPE=build-android-toolchain

The following options are interpreted by the build-android-toolchain script:

  • BINUTILS_VERSION - Binutils version to build, among those present in the repository (default 2.20.1)

  • BINUTILS_URL - URL of replacement binutils source tarball to use for toolchain (will also override BINUTILS_VERSION)

  • GCC_VERSION - GCC version to build, among those present in the repository (default 4.4.0)

  • GCC_URL - URL of replacement gcc source tarball to use for toolchain (will also override GCC_VERSION)

  • SYSROOT_NDK_URL - URL of an upstream NDK release to extract the sysroot from. If not set, a bare-metal toolchain is built. Currently, the android-9 API level sysroot is used.

The following options are interpreted by the build-android-toolchain-linaro script:

  • GCC_URL - Extended URL of the gcc source tarball to use for the toolchain. This option is mandatory (as the Linaro Android repository doesn't include any gcc source itself). The value of this option is passed directly as the --with-gcc parameter to linaro-build.sh; see its docs for the values it may take.

Note that the build-android-toolchain-linaro build doesn't accept the GCC_VERSION, BINUTILS_VERSION or BINUTILS_URL parameters (in particular, binutils is fixed at 2.20.1).
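
The constraints on a build-android-toolchain-linaro config could be checked up front with something like the sketch below. This is a hypothetical lint, not part of the build tools; in particular, whether the real script rejects or silently ignores the unsupported variables is not stated here, so treating them as errors is an assumption.

```shell
# Hypothetical pre-flight check for BUILD_TYPE=build-android-toolchain-linaro.
check_linaro_toolchain_config() {
    status=0
    if [ -z "${GCC_URL:-}" ]; then
        echo "error: GCC_URL is mandatory" >&2
        status=1
    fi
    for name in GCC_VERSION BINUTILS_VERSION BINUTILS_URL; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "error: $name is not accepted by this build type" >&2
            status=1
        fi
    done
    return "$status"
}
```

A job wrapper could call `check_linaro_toolchain_config` after loading the config and stop early on a non-zero status, instead of failing hours into the build.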

1.3.3. Build variables for LAVA integration

The following build config variables are related to LAVA integration:

  • LAVA_SUBMIT - If set to 1, build artifacts are submitted to LAVA for testing after the build finishes successfully.

  • LAVA_TEST_PLAN - LAVA test plan(s) to use for testing (optional; the default depends on BUILD_TYPE).

  • LAVA_SUBMIT_FATAL - If 1 (the default), a failure to submit the job to LAVA makes the build fail. Otherwise (0), only a warning is written to the build log.

  • LAVA_ANDROID_BINARIES - Controls whether LAVA is instructed to install binaries into the Android images. This defaults to on; setting it to False/0/off disables this action.
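
The interplay of LAVA_SUBMIT and LAVA_SUBMIT_FATAL can be sketched as below. The function names are hypothetical, and `lava_submit_cmd` is only a placeholder for the real submission step; the exact matching of the "off" values of LAVA_ANDROID_BINARIES (case-sensitive here) is also an assumption.

```shell
# lava_submit_cmd is a hypothetical placeholder for the real submission step.
lava_submit_cmd() { echo "submitting to LAVA"; }

maybe_submit_to_lava() {
    # Only submit when LAVA_SUBMIT is exactly 1.
    [ "${LAVA_SUBMIT:-0}" = "1" ] || return 0
    if lava_submit_cmd; then
        return 0
    fi
    # Submission failed: fatal unless LAVA_SUBMIT_FATAL was set to 0.
    if [ "${LAVA_SUBMIT_FATAL:-1}" = "1" ]; then
        echo "error: LAVA submission failed" >&2
        return 1
    fi
    echo "warning: LAVA submission failed, continuing" >&2
    return 0
}

# Binary installation is on unless explicitly switched off.
lava_binaries_enabled() {
    case "${LAVA_ANDROID_BINARIES:-on}" in
        False|0|off) return 1 ;;
        *) return 0 ;;
    esac
}
```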

1.4. The Frontend

Most of the time you will just interact with the frontend at https://android-build.linaro.org/. It lets users view, create, edit, delete and build jobs in an easy-to-use way. Members of the ~linaro-android-builders group on Launchpad can create "personal" builds. Members of certain other groups can set up group-owned builds, which have additional semantics (for example, members of ~linaro-android-official-builders can set up and edit 'official' builds).

1.5. Looking Behind the Curtain: Jenkins

There is an install of Jenkins at https://android-build.linaro.org/jenkins. It can be used to inspect the state of the system in more detail than the frontend allows, and admins can use it to change details.

1.6. Advanced: Editing a Job's Description

Jenkins allows a description to be specified for each job, and this description is also shown on the job's frontend page. This is ideal for providing installation instructions, etc. To edit the description, log in to Jenkins as a user with administrative permissions, open the job in question, and click the "add description"/"edit description" link.

Each job has its own description, so you may need to copy and paste the description across a number of jobs with the same BUILD_TYPE. The best practice is therefore to keep the description short and link to a common page (e.g. in the wiki) that provides detailed instructions.

1.7. Advanced: Adding a New Build Script

If you want to use the system to build something other than the default Android (for example, the Android toolchain), you can add a script to the build-scripts directory of the build tools. It will be run with bash and could be as simple as:

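# Load the shared helper functions shipped with the build tools.
source "${BUILD_SCRIPT_ROOT}"/helpers

# Sync the source tree; "${1}" is the first argument passed to the script.
repo-sync-from-mirror "${1}"

# Build the default make target.
make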

All the variables from the configuration will be present in the environment.

To get your script into the branch used by the build system, submit a merge proposal to lp:linaro-android-build-tools -- the jobs fetch the tip of this branch for each build.

1.8. Advanced: Debugging Build Process

When making changes to build configs and/or build scripts, it is sometimes challenging to test and debug them because of the long turnaround of a typical build (a few hours), especially if incremental tweaks are needed. This section describes a few tricks for more streamlined debug passes. Please remember that before changes are deployed to all production builds, they should still undergo complete integration testing (i.e. be tested on a complete build, without the short-circuit features described in this section).

1.8.1. Overall process

All testing should happen on a dedicated test build, a clone of the build to be changed (or just a sample build if the changes are to be applied globally). If build scripts need changes, linaro-android-build-tools should be branched and the changes applied to the branch, with the Jenkins config of the test build job updated to pull from this branch. Finally, if build slave configuration is being tested (such as the addition of new packages), a whole new build slave config may need to be added to Jenkins, and the test job updated to select exactly this slave type via the "labels" setting.

1.8.2. Cutting build time

To run a normal build process, but much quicker than usual, the following may be added to the build config:

MAKE_TARGETS=userdatatarball

This will build only the userdata tarball (userdata.tar.bz2) and takes on the order of half an hour. Note that the results of such a build won't pass LAVA testing, for example, because they lack the other required artifacts.
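
How a make-based build script might honor MAKE_TARGETS and MAKE_JOBS can be sketched as follows. The function name and both DEFAULT_* values are hypothetical; the actual defaults live in the individual build scripts.

```shell
# Sketch of how a make-based build script might honor the overrides.
# DEFAULT_TARGETS and DEFAULT_JOBS are hypothetical script defaults.
DEFAULT_TARGETS="boottarball systemtarball userdatatarball"
DEFAULT_JOBS=4

make_command() {
    # Use the config's values when set, the script's defaults otherwise.
    echo "make -j${MAKE_JOBS:-$DEFAULT_JOBS} ${MAKE_TARGETS:-$DEFAULT_TARGETS}"
}

MAKE_TARGETS=userdatatarball
make_command
```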

1.8.3. Short-circuiting build time

Sometimes what needs testing is not the build (compilation) process per se but integration aspects, such as artifact publishing or LAVA submission. In this case, the compilation process can be bypassed completely, leading to a much quicker turnaround: artifacts are taken from another, existing build and processed further as if they had been produced by the current build.

BUILD_COPYCAT=~linaro-android/panda-ics-gcc47-tilt-tracking-blob/100

This will take artifacts (specifically: system.tar.bz2, boot.tar.bz2, userdata.tar.bz2) from build #100 of ~linaro-android/panda-ics-gcc47-tilt-tracking-blob (as hosted on snapshots.linaro.org).
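
A BUILD_COPYCAT value has the shape owner/job-name/build-number, so it can be taken apart with plain parameter expansion. The sketch below only parses the value and lists the artifacts named above; the variable names are hypothetical, and how the artifacts are actually fetched is not shown.

```shell
BUILD_COPYCAT="~linaro-android/panda-ics-gcc47-tilt-tracking-blob/100"

# Everything before the last '/' is the job, the rest is the build number.
copycat_job="${BUILD_COPYCAT%/*}"
copycat_build="${BUILD_COPYCAT##*/}"

echo "job:   $copycat_job"
echo "build: $copycat_build"
for artifact in system.tar.bz2 boot.tar.bz2 userdata.tar.bz2; do
    echo "would reuse $artifact"
done
```

The value is quoted here so that the leading "~" can never be subject to tilde expansion when the assignment is evaluated by a shell.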

2. Developer Guide

2.1. Guide to the Codebase

There are a few pieces of code that make up the system:

Custom:

  • The frontend (see 2.1.1)
  • The mirror service (see 2.1.2)
  • The build scripts (see 2.1.3)

3rd-party/distro packages:

  • Jenkins with EC2 plugin
  • git-daemon

2.1.1. The Frontend

This is a Django application, but unlike most Django applications it is not backed by a database (apart from session/user data); all its state lives in Jenkins. There is a lot of JavaScript, using the YUI 3 libraries. The code layout is perhaps a bit awkward, as this is in some sense simpler than most Django projects, but it's small, so it shouldn't take long to find things. Running it locally is a bit awkward since it consists of so many moving parts, but the usual Apache setup is mimicked in the 'site-demo.tac' Twisted web server.

All pages are (almost) static HTML that pulls in data by AJAX, mostly by going directly to Jenkins' JSON API. Write operations go through an API exposed by Django (the /api views, see android_build.frontend.api).

Information that isn't in the JSON API, and all write operations, go through the Django views, which enforce the simple ACLs and make authenticated requests to the Jenkins instance.

There are no tests :/

2.1.2. The Mirror Service

The mirror service is an XML-RPC service that takes a manifest as understood by repo, makes sure that the mirror has up-to-date copies of each repo mentioned in the manifest, and then returns a new manifest that refers to the copies on the mirror.

Implementation-wise, the service is written in Twisted. When given a manifest, it splits it into a set of manifests used to update the local copies, one per host, and runs 'repo sync' in /mnt/mirror/$host for each host in parallel, then returns another version of the manifest to the client. There is enough locking to prevent 'repo sync' from being run more than once at a time in the same directory.
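
The "one sync per host, in parallel, never two in the same directory" behavior can be illustrated with a short shell sketch. The real service is written in Twisted and does not work this way internally; the function names, the mkdir-based lock and the SYNC_CMD hook are all assumptions made so that the flow can be tried without repo.

```shell
# Sketch only. MIRROR_ROOT defaults to /mnt/mirror as described above;
# SYNC_CMD is a hypothetical hook so the flow can be tried without repo.
MIRROR_ROOT="${MIRROR_ROOT:-/mnt/mirror}"
SYNC_CMD="${SYNC_CMD:-repo sync}"

sync_host() {
    host_dir="$MIRROR_ROOT/$1"
    lock_dir="$host_dir.lock"
    mkdir -p "$host_dir"
    # mkdir is atomic, so only one caller at a time gets past this loop.
    until mkdir "$lock_dir" 2>/dev/null; do
        sleep 1
    done
    ( cd "$host_dir" && $SYNC_CMD )
    rc=$?
    rmdir "$lock_dir"
    return "$rc"
}

sync_all_hosts() {
    # One background job per host, then wait for all of them.
    for host in "$@"; do
        sync_host "$host" &
    done
    wait
}
```

Calling, for example, `sync_all_hosts android.git.kernel.org android.git.linaro.org` would run both syncs concurrently, while a second sync of the same host would block until the first one released its lock.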

The code to factor the manifests is well tested, the other parts are not (but are fairly simple).

The error reporting is shockingly absent and many details are hardcoded, but otherwise this works surprisingly well given it only took a couple of days to write.

See the spec at https://wiki.linaro.org/Platform/Android/Specs/AndroidMirrorService for more.

2.1.3. The Build Scripts

There are only a few scripts really, and it's perhaps artificial that they are bundled together.

The scripts in node/ are invoked by Jenkins on the build nodes. root-setup-android-build-node is run on instance startup; it installs the packages required to build Android and creates a user to run the builds as. build runs a build. root-setup-android-toolchain-build-node is not used at this time. If a different build type needs to run a different setup as root, then we'll need to think of a way to achieve that.

The scripts in build-scripts are the valid values for BUILD_TYPE in the configuration. They are invoked by node/build.

The scripts in control are designed to be run on the master node. setup-control-node mostly configures a blank system to be the master and mangle-jobs helps manage the Jenkins jobs.

2.2. Production Deployment Details

The persistent services run on the android-build.linaro.org machine (which is itself an EC2 instance). The setup of this machine is 'documented' in the setup-control-node script, which does most of the work in setting up a blank machine to be the master node for the build service.

A few people (lool, asac, mwhudson & pfalcon) have ssh access to the 'ubuntu' account on this machine.

There are a few service accounts:

  • jenkins (runs jenkins)
  • gitdaemon (runs git-daemon)
  • build-system-frontend (owns the code of the frontend & combo loader, although they run as www-data)

  • git-mirror (this user runs the mirror service and owns the /mnt/mirror hierarchy, although the code of the mirror service is in ~ubuntu).

The frontend and the combo loader (for serving YUI3 assets) are run through Apache's mod_wsgi.

~ubuntu/linaro-android-build-tools/ contains a branch of lp:linaro-android-build-tools, which contains several maintenance scripts in control/ .

To deploy code updates, use the control/deploy-control-node script, which installs updates for the frontend and mirror services. There is no need to explicitly "deploy" the build tools, as they are fetched from bzr for every new build (the master copy of the build tools on android-build.linaro.org should of course be bzr-pulled regularly, to make sure that deploy-control-node and the other scripts are up to date). deploy-control-node deploys each codebase into a separate directory, tagged with the bzr revision and deployment time, and then simply moves the "current" symlink to that directory. This allows a quick rollback to an older version if something goes wrong. The oldest deployment directories should be removed manually from time to time.
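
The symlink-swap deployment scheme described above can be sketched as follows. This is not the code of deploy-control-node; the function names and the directory naming are hypothetical, and the actual installation step is left as a comment.

```shell
# Sketch only: the function names and directory naming are hypothetical.
deploy_release() {
    deploy_root="$1"   # e.g. a per-service deployment directory
    release="$2"       # e.g. "<bzr revno>-<timestamp>"
    mkdir -p "$deploy_root/$release"
    # ... install the code into "$deploy_root/$release" here ...
    # Repoint "current"; -n replaces the link instead of following it.
    ln -sfn "$release" "$deploy_root/current"
}

rollback_to() {
    # Rolling back is just pointing "current" at an older directory.
    ln -sfn "$2" "$1/current"
}
```

Because older release directories are left in place, a rollback is a single symlink update and needs no reinstall.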

The utils/mangle-jobs/mangle-jobs tool can be useful when you need to change job configs in Jenkins (for example, to update the description of all or some jobs). To use it, you need to write a small "mangle" script, which will be applied to each job config. A few are already written; see *.mangle.

2.3. Development Sandboxes

All prototyping and development should happen in sandboxes, and changes should be deployed to production only after they have been sufficiently tested and approved. The control/sandbox-create script completely automates sandbox creation (it starts a new EC2 instance and deploys the codebase to it using the previously described setup-control-node script).

Developers from other teams who need a sandbox should submit a ticket to the linaro-android-infrastructure project. There are a few important rules to remember about sandbox access:

  • Constantly running sandboxes cost Linaro money and take EC2 slots away from other services. Sandbox lifetime should be minimized; the recommended duration is 1-2 days. The absolute maximum is 5 days, after which a sandbox may be deleted without notice.
  • The cloud is not 100% reliable, and Linaro's cloud resources are limited, so a sandbox may be terminated at any time, either due to an error or to free up resources. Please don't leave important assets on sandboxes for a prolonged time, and watch IRC for important announcements.

2.4. The Jenkins Configuration

setup-control-node by default installs a minimal Jenkins config suitable for cloud-buildd (all EC2 credentials must be entered manually after the installation, though).

The page at https://wiki.linaro.org/Platform/Android/Specs/LinaroAndroidBuildInfrastructure/Backend#Configuring%20jenkins contains the original details on the settings in it (now outdated).

The config of the production (android-build.linaro.org) Jenkins is stored in a private branch, lp:~linaro-infrastructure/linaro-android/jenkins-config. It is pushed to by a rather Rube Goldberg, ad hoc mechanism: log into the server (with ssh -A, so you can use your keys to push to lp), run 'bzr add; bzr ci' as jenkins in ~jenkins, then, as ubuntu, run bzr up in ~jenkins-config and finally push this branch to Launchpad (you will need to set your LP username with "bzr launchpad-login <user>" first; afterwards, always check it with "bzr launchpad-login", as this is a shared account). This could definitely use some automation! Also, the .bzrignore in ~jenkins should almost certainly ignore more than it currently does.

3. Known Issues

3.1. Space

Jenkins allows keeping a certain number of artifacts per build or keeping artifacts for a certain number of days. You can independently set how long to keep the other build bits (log files, etc) for.

We currently have a 500 GB partition for storing build archives, which is projected to be enough for one year of archive storage at our current sustained build rate.

There are also other options for handling build artifact archiving:

  1. Mirror the artifacts to snapshots.linaro.org and don't persist them in jenkins for very long
  2. Store artifacts in S3, even allowing people to provide their own credentials if they want artifacts to persist for longer.

3.2. Mirror service error handling

The mirror service currently doesn't detect when its calls to repo sync hang. This could lead to issues, but no such case has actually been seen in practice.

3.3. Mirror service stale timeouts

To avoid hitting upstream servers excessively, the mirror service allows a stale timeout to be configured per upstream. For example, the mirror of a particular host can be configured not to update if it was already updated within the last 5 s or 24 hr. This may lead to missing the newest commits, so the timeout should be chosen with care (hours, up to a day, for mostly static AOSP version branches; on the order of seconds for *.git.linaro.org; somewhere in between for active 3rd-party projects).
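
The stale-timeout decision amounts to comparing the age of the last update with the configured timeout. The sketch below illustrates that check in shell; the function names and the stamp file (holding the epoch time of the last update) are a hypothetical bookkeeping scheme, not how the mirror service actually stores this.

```shell
# Sketch: decide whether a host's mirror is due for an update.
# The stamp file holds the epoch time of the last update (hypothetical).
mirror_is_stale() {
    stamp_file="$1"
    timeout_secs="$2"
    [ -f "$stamp_file" ] || return 0      # never updated: treat as stale
    last="$(cat "$stamp_file")"
    now="$(date +%s)"
    # Stale once at least timeout_secs have passed since the last update.
    [ $((now - last)) -ge "$timeout_secs" ]
}

mark_mirror_updated() {
    date +%s > "$1"
}
```

With a timeout of 86400 (24 hr), a host updated a minute ago is skipped, while with a timeout of 5 the same host would be synced again almost immediately.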

Platform/Android/LinaroAndroidBuildService (last modified 2013-08-27 11:17:59)