Challenges in Release Engineering

Sunday, June 24, 2012

I've been at Mozilla since the end of April. The learning curve is steep, but I'm having fun climbing. My coworkers are very friendly and helpful in answering my barrage of questions about how things work. One of the things I've noticed is that there are many common challenges in release engineering, no matter what you're building. Here's my list so far:

1. Signing builds is like a falafel sandwich.  (Always includes some pita).

Image ©chotda, http://www.flickr.com/photos/santos/3531853459/sizes/z/in/photostream/ licensed under Creative Commons by-nc-sa 2.0

2.  Scaling your build is extremely challenging: you have to manage infrastructure utilization, maintain the tooling that manages that infrastructure, and optimize build parallelization. Developers will consume all available build infrastructure and then ask for more. Scaling build infrastructure to accommodate future growth is an ongoing process.

3. Proliferating numbers of platforms on which to build, test and run performance metrics add complexity.


 Image ©misterbisson, http://www.flickr.com/photos/maisonbisson/109211670/sizes/o/in/photostream/ licensed under Creative Commons by-nc-sa 2.0 
4.  Update game adventures.  When you have an open platform that includes the installation of third-party components, it's inevitable that you will encounter unexpected update cases in the wild that weren't reflected in test cases.

5. Frequent releases generate faster user feedback on new features.  However, additional releases are expensive for the release engineering, quality assurance and release management teams.  Each additional release eats time, both people and machine.  Making the community aware that builds are not free is an ongoing communication exercise.

Image ©aarongeller, http://www.flickr.com/photos/aarongeller/360135019/  licensed under Creative Commons by-nc-sa 2.0
6.  You can have great documentation and process, but the accumulated technical and tribal knowledge required to quickly resolve a complex, broken build is earned only through experience.

  Image ©Ian Muttoo, http://www.flickr.com/photos/imuttoo/2631466945/sizes/z/in/photostream/ licensed under Creative Commons by-nc-sa 2.0 
7.  If you can compile your code, this doesn't mean there won't be issues with packaging, signing and testing it.  Making a build available in a format for millions of users to consume > code that compiles.  Education is needed to make people aware of this distinction. 
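To make the parallelization point in item 2 a bit more concrete, here is a minimal sketch of how a dependency graph can be turned into "waves" of tasks that could run side by side. The task names and the graph itself are hypothetical, made up for illustration; this is not how any particular build system does it, and it assumes every dependency is declared explicitly, which is often not true for steps like signing and testing.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_waves(deps):
    """Group tasks into waves.

    `deps` maps each task to the set of tasks it depends on.
    Tasks in the same wave have no unmet dependencies, so they
    could in principle run concurrently.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # unblocks tasks that waited on this wave
    return waves


# Hypothetical task graph (an assumption for this example only).
BUILD_GRAPH = {
    "compile": set(),
    "package": {"compile"},
    "sign": {"package"},
    "unit_tests": {"compile"},
    "perf_tests": {"package"},
    "publish": {"sign", "unit_tests", "perf_tests"},
}

if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(BUILD_GRAPH), start=1):
        print(number, wave)
```

With this toy graph, packaging and unit tests land in the same wave, as do signing and performance tests. The hard part in practice is getting a graph like this to exist at all for the steps that come after compilation.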

What release engineering challenges do you face?

7 comments:

Anonymous,  1:07 PM  

I think the biggest challenge is getting the people responsible for training programmers --- i.e., university profs --- to accept that this is a challenging problem that deserves time in the curriculum (and more serious research attention). The next time I update http://third-bit.com/articles/not-on-the-shelves-2009.pdf, a book like third-bit.com/articles/not-on-the-shelves-2009.pdf on packaging, releasing, and installing will definitely make an appearance.

Anonymous,  1:07 PM  

Whoops, copy-and-paste error: the second link in the previous comment should be http://www.amazon.com/Software-Build-Systems-Principles-Experience/dp/0321717287

reprogrammer 1:25 PM  

What makes auto-parallelizing the build challenging? Isn't it possible to automatically parallelize the build by analyzing the dependency graph of the build system?

Kim Moir 1:49 PM  

Thanks Greg, that's a good point I missed. There aren't any formal training programs for this type of work, although several Mozilla release engineers have come from the excellent program at Seneca College.

reprogrammer: With respect to parallelization, yes, analyzing the dependency graph is definitely useful. I just find it's always something in the back of your mind: how can you parallelize more tasks to reduce total build time? Not just compilation, but signing, packaging, etc.

reprogrammer 1:57 PM  

Kim Moir: Just trying to understand, do you mean that it's difficult to auto-parallelize the build because not all dependencies are explicitly described in a form that is easy to analyze by a machine? How do you currently improve the parallelism of your build system?

Anonymous,  3:39 PM  

"5. Frequent releases generate faster user feedback on new features. However, additional releases are expensive for the release engineering, quality assurance and release management teams. Each additional release eats time, both people and machine.

Making the community aware that builds are not free is an ongoing communication exercise."


I've never committed code to Mozilla, but I'm not sure I understand your point here.

I understand that RELEASES need testing etc and therefore use people time, but surely additional BUILDS only use up extra machine time? Surely splitting a commit across two builds could even reduce people time, by making it more obvious which change caused the regression.

Kim Moir 5:45 PM  

reprogrammer: Speaking from my experience as an Eclipse release engineer, yes, the dependencies are expressed in a machine-readable form to compile the source code into OSGi bundles. However, there was no machine-readable output that could be parsed to optimize the signing, testing or performance tests automatically. This was just an Ant script that called these modules manually. So no magic there.

Ian: Perhaps I should have used the word releases there to be more clear. However, builds are not free. Intermittent problems such as hung slaves, network issues, and hardware and power failures can occur, and they require human intervention. So in theory, a build should always succeed and have little incremental cost, assuming that your infrastructure can handle the additional load. However, in practice, this is not always the case :-)
