>> Wednesday, November 25, 2009
Hi Boris, Chris, Doug and Ed
Hey, how's it going? I hope that you are all well.
Like a mall on Black Friday, around here it's peak Eclipse committing season. Lots of bugs to fix. Lots of builds to run. As you know, builds are quite a pain point at Eclipse. I'm excited about the possibilities in the b3 project to make things better. Building software is complex, and Eclipse is no exception. In addition to tooling to make builds easier, we need hardware to make builds faster. Our build today takes about five hours to complete, and the tests take an additional 6.5 hours. Really, it's not pretty.
Many Eclipse projects run their builds on Hudson on build.eclipse.org. Hudson is fantastic because there's a rich set of plugins that you can use to enhance the functionality of your build. Also, since this server has local access to the eclipse.org filesystem for code checkouts, you're less prone to network errors which can break the build. It also has LDAP integration with your committer login, so you can restrict your build configuration to the committers on your project. In theory, if you need more build machines to run your build, you can use the Amazon EC2 plugin to provision more machines in the cloud, or other plugins to start builds on local slave machines. Good stuff.
However, one of the things that the Foundation doesn't provide today is test machines. This means that we can't run our build at the Eclipse Foundation. The Eclipse Project builds zips for 14 different platforms. We run JUnit tests on three native platforms: Windows, Linux and Mac. They are the most commonly downloaded platforms. We need test machines to ensure that we don't have any bugs specific to a platform. Why do our tests take so long? We have 54,000 JUnit tests. You don't produce quality software by skimping on tests.
This isn't just about the Eclipse and Equinox projects. Test machines could be very useful for other projects too; for instance, the XSL tools project has expressed interest in using test servers. In addition, these machines could be used as slave machines for running the build in the event that the main Hudson server is too busy. If we had enough machines, we could run more tests in parallel and reduce the time it takes our build to complete. This would be a big win for the community and our committers.
One thing I investigated is running tests in the cloud. However, most cloud services don't provide a way to run tests on Macs, and we need to make sure that our Mac users are happy. If there is a way, I'd appreciate a link. In addition, one of the advantages of running tests on machines local to the eclipse.org filesystem is that we don't spend time copying stuff back and forth across the network. It's just there.
So, what I'm asking of you is this: at the next board meeting, please bring up the issue of funding test infrastructure at the Eclipse Foundation. It might even be an advertising opportunity for one of the member companies if they donated hardware. Other companies could donate money to pay for the additional rack space. I don't know right now what the final technical solution will be or what it will cost. All I'm asking right now is to start the conversation.
For many years, the Eclipse project has been criticized for not being open enough. Having our build process fully on eclipse.org servers would make us more open. It would also allow any of the Eclipse and Equinox project committers, regardless of company affiliation, to initiate a build. If we had enough hardware, our build could be faster, and we could spend less time waiting for builds and more time fixing the bugs the builds reveal.
Please bring this issue up at the next board meeting.
P.S. Right now we have the following test machines and our tests take about 6-8 hours to complete. Obviously, if we had more machines running tests in parallel, the build would take less time.
1) JUnit: 2 Linux, 2 Windows, 1 Mac, 1 test CVS server
2) Performance: 2 Windows, 2 Linux, 1 database server