Releng of the Nerds: Hudson best practices and Eclipse

One of the plan items for Eclipse 3.7 I've assigned is to migrate the Eclipse and Equinox build to run on the Hudson installation at eclipse.org. So I've spent a lot of quality time with Hudson lately. (Soon to be called Jenkins).

Today, I attended a webinar about optimizing Hudson in production hosted by Kohsuke Kawaguchi of CloudBees. Since my handwriting is similar to the last slide in this Oatmeal cartoon, I thought it would be better to jot my notes down in a blog post. As I was listening to his suggestions, I thought about how his seven best practices apply to our use of Hudson at Eclipse.

1. Backups

Backup your Hudson install to protect from disaster and accidental configuration changes.

Image © davemorris, http://www.flickr.com/photos/davemorris/2395980976/sizes/z/in/photostream/ licensed under Creative Commons by-nc-sa 2.0

$HUDSON_HOME on the master stores everything that needs to be backed up. The slaves doesn't need to be backed up.
Live backups are fine as configuration changes are atomic
In addition to the backup of the configuration, a backup of the filesystem is necessary to provide consistent snapshot.
Backups are great but can you restore? On a regular basis, try restoring your hudson configuration and starting up a test hudson instance with the restored hudson.war file.

2. Disk

Prepare for disk utilization growth
Space is more important than speed when ordering new drives
Plan for expandable disk volumes
If it's too late, symlink is your friend

3. Use native packages

For instance, *.deb, *.rpm. Easy to update and install, and they include init scripts.
Configuration is stored in /etc/default/hudson and /etc/sysconfig/hudson
Existing $HUDSON_HOME can be migrated
Native Windows package is a work in progress

@Eclipse: Backups and disk configuration are managed by the webmasters, along with upgrading and installing hudson.

4. Distributed builds

You will grow beyond a single system
Increasing load on a single machine is not the only issue
It's cheaper to grow horizontally (additional machines) versus vertically (more memory and CPUs on a single machine)
It's also good to have isolation between build processes in case you want to diversify the test and build platforms
Let master launch slaves via SSH or DCOM. This makes it easier to keep the cluster up and running.
SSH public key authentication makes it easy to connect to slaves on disparate platforms.

@Eclipse: Yes, we have distributed builds and several platforms available for test purposes.

5. Labels

Treat build machines like livestock, not like pets

Image © davegroth, http://www.flickr.com/photos/davegroth/9107888/sizes/m/in/photostream/ licensed under Creative Commons by-nc-sa 2.0

Build machines should be interchangeable
Don't tie builds to a specific build machine, unless there are platform limitations. This approach will yield better resource utilization and less downtime. Queued builds can use the next available machine instead of waiting for a specific machine.
When using labels for your Hudson slaves, name them to reflect their capability. For instance, if the slave is Mac or Windows hardware, the label should reflect this.
If there is a real reason to limit build to run on a single machine (such as wanting to run platform specific tests), tou can use boolean expressions with labels. For instance, this Hudson job shouldn't run the slaves running Mac or Windows

@Eclipse: We use label our build machines as livestock. No pets for us. However, I'm quite envious that the Mozilla foundation has adopted red pandas (firefoxes).

6. Invest in a good URL

By default Hudson runs on port 8080. This is difficult for users to remember.
Use the host's alias instead of the machine name. Thus if the machine names changes, the users won't have to go to a new URL.
Share port 80 with other applications using Apache reverse proxy with which will also allow you to run Hudson as a non-root user.

@Eclipse: https://hudson.eclipse.org is short and simple.

7. Try to keep the build records under control.

Discard old build records if you can
Removing old builds reduces start-up time and memory usage.
Settings are on a per-project basis. For instance, here are settings for one of my Eclipse builds.

@Eclipse: Yes, we do clean up old builds. Also, the webmasters have created a cron job to send a friendly email every week listing of the Hudson disk utilization to remind the release engineers to clean house.

The slides are available here and obviously provide a lot more detail than my short summary. If you're interested the best practices for configuring Hudson for a production environment that has room to grow, this presentation is provides very useful suggestions.

4 comments:

Mickael Istria, 2:01 AM: Hi,

This is indeed a collection a good practices, With this post, I could learn some new tips with Hudson. Thanks.
Something we do at BonitaSoft instead of backups is that we store all the jobs configuration (HUDSON_HOME) on our SVN. Then we have backups and versionning, and setting up a new Hudson instance is simply about checking-out the HUDSON_HOME on SVN. Propagating modifications becomes just a simple commit.
Kim Moir, 4:41 PM: Mickeal, Good idea! That way you can easily roll back to a previous configuration to recreate an older stream build. Thanks for the tip.
Anonymous, 12:41 AM: to 7.: Removing old builds implies also clipping the build time trend statistic. There is a new option if you like to keep the small logs for the statistic but remove the artifacts: Increase the number of builds to keep, click the Advanced... button and specify a lower number of builds to keep with artifacts.
Morten, 2:50 AM: Like Mickael said. An alternative is to keep hudson under source-control. Your entire system can the be scaled out automatically and any change is version controlled.

However if you want to keep builds and their artifacts then you still need to do backups. You don't want to keep the artifacts and buildlogs in source-control.

We've been setting this up for a while, feel free to read my post about it: http://www.morkeleb.com/2011/01/17/doing-continuous-delivery-with-hudson-and-git/

There is a version of it on github where the .gitignore file is resonably well up to date.

Releng of the Nerds

Hudson best practices and Eclipse

>> Wednesday, January 19, 2011

4 comments:

Post a Comment

Blog Archive