This is the landing page for the Ansible configuration management system, as used by a number of LAVA/Infrastructure/ITS/DevOps projects at Linaro.

Guidelines & Best Practices

Ansible best practices followed by the team:

  • Ansible configuration for all services should be stored in the common repository - ansible-playbooks

    • Previously, Ansible configs were stored in the ansible/ directory at the top level of each service's main repository. This proved not to scale well towards complete, easily accessible infrastructure coverage.

  • We aim to have a reusable, common role repository and individual per-service playbooks - all at the top level
    • Playbooks migrated from the per-repository structure noted above still live in individual subdirectories under the per-service directory, but are to be refactored.

  • Playbooks should be --check mode friendly. Unfortunately, this requires some additional care, especially to support running in check mode against a bare server (see the sketch after this list):
    • file state=link should use force=yes

    • file recurse=yes should have explicit state=directory

    • appropriate tasks should be tagged as "deps" (see below)
  • Task names should be less than 80 characters long and be descriptive
    • Use sentence capitalization
    • Do not append a full stop
  • Playbooks should aim for 100% idempotence, i.e. after a playbook has run and performed some actions, a second run should report 0 changes; any changes reported should be "real" changes. The motivation should be obvious: a playbook which performs changes on every run raises suspicions about whether it is functioning correctly. Over time, users get used to ignoring the changes it reports and may overlook important ones. Achieving 100% idempotence with Ansible isn't easy, but we should try; one way to get there is to prefer simple (KISS) approaches over complex ones for individual tasks.
  • Handler names should be identifier-like: all lower case, dash-separated
    • To name a handler, start with the action to be performed, followed by the service name: restart-apache, stop-apache, start-nginx, etc.
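
For illustration, here is a minimal sketch of tasks and a handler following the conventions above. It is not taken from the actual playbooks; the package names, paths and file names (example-site, example.crt, etc.) are hypothetical:

    # tasks - names are descriptive, sentence-capitalized, no full stop
    - name: Install web server packages
      apt: name=apache2 state=present
      tags: [deps]

    - name: Create site directory tree
      # explicit state=directory so recurse=yes behaves in --check mode
      file: path=/srv/example-site state=directory recurse=yes owner=www-data
      tags: [deps]

    - name: Link enabled site config
      # force=yes keeps state=link working when run with --check
      file: src=/etc/apache2/sites-available/example dest=/etc/apache2/sites-enabled/example state=link force=yes
      notify:
        - restart-apache

    - name: Generate self-signed certificate
      # creates= keeps this task idempotent - it reports a change only on the first run
      command: openssl req -new -x509 -days 365 -nodes -subj /CN=example -keyout /etc/ssl/example.key -out /etc/ssl/example.crt creates=/etc/ssl/example.crt

    # handlers (e.g. handlers/main.yml in a role) - lower case, dash-separated
    - name: restart-apache
      service: name=apache2 state=restarted

On a second run against the same host, all of these tasks should report "ok" rather than "changed".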

Tag conventions

  • deps
    • Tasks which install any dependencies for the entities being set up (including dependencies of other tasks) should have this tag. Oftentimes, on a system which wasn't fully installed, running portions of a playbook may fail due to missing dependencies; in that case, running just the tasks with this tag should help.
    • It may not be immediately clear where the edge between "deps" and non-"deps" tasks lies, especially taking into account that an entire playbook is a set of layered dependencies for a particular service. For example, installing packages is clearly a dependency, so installing a particular PPA in order to install packages from it is "deps" too. Many actions, even in --check mode, will fail if the user they reference doesn't exist, so creating system-level users can be considered "deps" as well. Copying a few core files around, or creating symlinks on which other, more "semantic" tasks depend, can also be considered "deps" (after all, that's pretty close to what installing packages does). However, that's where it probably stops: for example, creating databases is not a dependency of setting up a service, it is the setup of the actual service.
    • Generally, the idea behind "deps" tasks is that after running a playbook with --tags=deps, it should be possible to run the same playbook with --check even on a fresh server (see the example below). However, due to peculiarities of --check handling for some Ansible actions, that may not be achievable in all cases, and bloating the "deps" task set should not be the way to achieve it by any means.
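
As a rough illustration of where the edge may lie (the module choices, the PPA and the names used below are hypothetical):

    # "deps": repository, packages and system users that other tasks rely on
    - name: Add service PPA
      apt_repository: repo='ppa:example/stable' state=present
      tags: [deps]

    - name: Install service packages
      apt: name=example-service state=present
      tags: [deps]

    - name: Create service user
      user: name=example-svc system=yes
      tags: [deps]

    # not "deps": this already is the setup of the actual service
    - name: Create service database
      mysql_db: name=example_db state=present

With such tagging, a fresh server can be bootstrapped with "ansible-playbook site.yml --tags deps" (site.yml standing in for the actual playbook), after which "ansible-playbook site.yml --check" should be able to run.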

Known Issues/Topics to Resolve

1. As described above, Ansible configs were originally hosted mostly with the service source code repositories. This suits development usage well, but is not ideal for production usage, where centralized access/control is required. It would be nice to find a structuring scheme which satisfies both requirements. - RESOLVED, moved to the centralized ansible-playbooks repo.

2. Credential storage details. Ansible allows security/private credentials to be cleanly and flexibly separated from the rest of the deployment configuration, so this is mostly about settling on a specific partitioning structure, as well as specific storage requirements for the security part (see the sketch below). - RESOLVED, but open to reconsideration as new options appear (like Ansible Vault).
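
For reference, one common partitioning pattern (shown here only as an illustrative sketch; the file layout and variable names are hypothetical and not necessarily what was settled on) keeps non-sensitive variables in plain vars files and moves sensitive values into a separate, encrypted file referenced through a naming convention:

    # group_vars/production/vars.yml - kept in plain text
    db_user: example
    db_password: "{{ vault_db_password }}"

    # group_vars/production/vault.yml - encrypted with "ansible-vault encrypt"
    vault_db_password: s3cr3t

Playbooks are then run with --ask-vault-pass (or --vault-password-file) so that the encrypted part is decrypted on the fly.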

History and Requirements

The question of systems configuration management was discussed at Linaro Connect US 2013. The following requirements were set forth for a solution:

  • Should be easy to bootstrap/deploy.
    • Ideally, should reuse existing networking and authentication infrastructure.
  • Should support a variety of deployment scenarios from the same config (parametrized as needed):
    • "Development" deployment to a locally run VM/container.
    • "Development"/"Sandbox" deployments to the Cloud.
    • "Production" deployments to the Cloud or dedicated servers.
    • Easy to redeploy to a new server.
  • Should be a modern solution, building on the best practices of, and lessons learned from, the previous generation of tools.
  • Should be well-maintained, with active developers and community.

The main contenders were Ansible and Salt.

Ansible was selected in particular because of its ease of deployment: it can be run straight from a git checkout and can easily be made to require no global configuration files.

Ansible natively uses SSH as the communication protocol.

It was decided to use Ansible for the deployment of services which previously were not using any configuration management system. For example, LAVA Lab, which uses Salt, will keep using it.

The pilot project for Ansible was the Android Build service "sandbox" setup, previously implemented as an ad-hoc shell script. Setups for many other services followed.
