EC2 Best Practices and Known Issues
Jenkins-based Build Services
Jenkins Known Issues with EC2
On pages such as https://android-build.linaro.org/jenkins/computer/ , slave types are listed in the <select> element by the AMI they use. As a result, if two different slave types use the same AMI, that page cannot distinguish between them.
Jenkins master 3-volume setup
One of the issues we experienced with the Jenkins master is that the build archive may fill the disk, leaving Jenkins no space to store its current state, such as the record of started build slaves. This sends it into a vicious circle: it does not see the existing slaves, so it starts a new one, but it cannot record that fact, so it goes on to start another. This leads to "zombie slave storms", limited only by instance caps, which can easily result in 20-50 instances being spawned in vain before the problem is caught.
The solution to this problem is to keep the Jenkins build archive (the jobs/ subdirectory of JENKINS_HOME) on a separate partition, so that even if it overflows, the JENKINS_HOME partition is not affected and Jenkins does not lose control of its build slaves. The following mount scheme was therefore established:
- / - System (OS) volume, 8GB. We keep Jenkins separate from it to ease OS upgrades (OS partitions change, Jenkins partitions stay)
- /mnt2 - Jenkins home volume, 2GB (Jenkins files actually take <500MB). /var/lib/jenkins is a symlink to /mnt2/jenkins (it is not mounted there directly for historical and maintenance reasons).
- /mnt2/jenkins/jobs - Jenkins jobs volume (hundreds of GB)
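The point of the scheme above is that the jobs directory and the Jenkins home directory sit on different filesystems. A minimal shell sketch of how that could be verified on a master follows; the function names mount_point_of and check_jenkins_layout are hypothetical and not part of any existing tooling, and the sketch assumes mount points without spaces in their names.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not existing tooling.

# Print the mount point of the filesystem that holds the given path.
# "df -P" prints one line per filesystem; the mount point is the last field.
mount_point_of() {
    df -P "$1" 2>/dev/null | awk 'NR==2 {print $NF}'
}

# Return 0 if the jobs directory is on a different filesystem than the
# Jenkins home directory, 1 if they share one, 2 if a path cannot be resolved.
check_jenkins_layout() {
    home_mp=$(mount_point_of "$1")
    jobs_mp=$(mount_point_of "$2")
    if [ -z "$home_mp" ] || [ -z "$jobs_mp" ]; then
        echo "ERROR: cannot resolve filesystem for $1 or $2" >&2
        return 2
    fi
    if [ "$home_mp" = "$jobs_mp" ]; then
        echo "WARNING: $2 shares a filesystem with $1 ($home_mp)" >&2
        return 1
    fi
    echo "OK: home on $home_mp, jobs on $jobs_mp"
}
```

With the paths from the list above, it would be called as check_jenkins_layout /mnt2/jenkins /mnt2/jenkins/jobs. A warning means a full build archive could again take the Jenkins state down with it.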
One issue with this setup is that, besides the jobs/builds archive, we usually also have auxiliary data on the master that may grow without bound, e.g. mirrors of repositories or tarballs. Such data therefore cannot be kept on the Jenkins home volume. To avoid a proliferation of volumes, it would ideally go on the Jenkins jobs volume, except that this volume has the individual job directories at its top level (symlinking to the jobs/ directory did not work, so we cannot organize it the way the Jenkins home volume is organized).
There is no elegant solution to this problem, but the following practical approach was used: an _extra directory was created at the top level of the jobs volume, any auxiliary directories are placed under it, and they are then symlinked from the appropriate places. For example, on android-build we have:
$ ls -l /mnt2
total 20
drwxr-x--x 16 jenkins adm   4096 2012-05-08 11:03 jenkins
drwx------  2 root    root 16384 2012-05-04 19:42 lost+found
lrwxrwxrwx  1 root    root    24 2012-05-04 08:42 seed -> jenkins/jobs/_extra/seed
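The listing above could be produced by steps along the following lines. This is a sketch under assumptions rather than the commands actually used on android-build: the helper name add_extra_dir is hypothetical, and the relative link target simply mirrors the one shown for seed.

```shell
#!/bin/sh
# Sketch only: add_extra_dir is a hypothetical helper, not existing tooling.

# Create <volume_root>/jenkins/jobs/_extra/<name> and expose it as
# <volume_root>/<name> through a relative symlink.
add_extra_dir() {
    volume_root=$1
    name=$2
    link="$volume_root/$name"

    # Refuse to touch a real file or directory that already has this name.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "ERROR: $link exists and is not a symlink" >&2
        return 1
    fi

    mkdir -p "$volume_root/jenkins/jobs/_extra/$name" || return 1
    # -n replaces an existing symlink instead of descending into its target.
    ln -sfn "jenkins/jobs/_extra/$name" "$link"
}
```

For the seed directory in the listing, the call would be add_extra_dir /mnt2 seed, run as a user allowed to write to both /mnt2 and the jobs volume. Ownership and permissions of the new directory are left to the caller.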
Platform/Systems/EC2BestPractices (last modified 2014-06-24 17:47:44)