In honor of Docker Global Hack Day, here's a recipe for running your Ruby tests in parallel using Docker and Rake on your local development machine.
This was inspired by Nick Gauthier's great post detailing how to parallelize Rails tests using a shell script and Docker. My use case was slightly different (Sinatra, not Rails), and I wanted to have my dependencies running in multiple Docker containers instead of a single one. I also wanted to avoid the cost of running `bundle install` each time the tests run. This means that the Ruby tests will run outside of Docker, pointing to the running Docker containers for their infrastructure dependencies.
To do this, I'm using Vagrant on OS X with the Docker Vagrant distribution. This means that I had to customize the Vagrant VM to include the version of Ruby I needed, install Bundler there, and map a directory to my host file system so I could run tests against the same code I am working on.
Clearly, this won't isolate the Ruby environment in its own Docker container. However, we don't typically experience problems that stem from differences in Ruby versions or gems, so this tradeoff was one I was willing to make.
Using this technique you can run multiple sets of Docker containers, each running a discrete portion of your infrastructure (in my case it's Redis and Solr). I'm going to assume you have your own Docker images configured in a working Docker installation (I used Docker 0.6 and 0.7 for my tests).
We'll be creating a custom Rake task to start the Docker containers and run the tests.

The Rake task can be configured to run as many forks as necessary. Remember, your total test time can only be as short as your longest individual test file, so you may want to break test files down to improve overall efficiency. The Rake task invokes Docker using a system call, and I found it was important to let each fork sleep for a bit to allow the process within the container to start.
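As a rough illustration of the fork-and-sleep approach described above, here is a minimal sketch rather than the original task. The image names, the host-port scheme, and the helper names (`split_into_groups`, `docker_commands`, `run_in_forks`) are all assumptions made for this example, and the `docker run -d -p host:container` form follows current Docker releases, so check it against your version.

```ruby
# Placeholder settings: image names, ports, and timings are assumptions.
IMAGES    = { "redis" => 6379, "solr" => 8983 } # image name => container port
BOOT_WAIT = 5 # seconds to let the services inside the containers start

# Deal test files out to `count` groups, round-robin.
def split_into_groups(files, count)
  groups = Array.new(count) { [] }
  files.each_with_index { |file, i| groups[i % count] << file }
  groups
end

# Give each fork its own host ports so the container sets don't collide.
def docker_commands(fork_index)
  IMAGES.map do |image, port|
    host_port = port + 1000 * (fork_index + 1)
    "docker run -d -p #{host_port}:#{port} #{image}"
  end
end

# Fork once per group: start that fork's containers, wait for them to boot,
# then yield the group's files to the caller's block, which runs the tests
# and returns true on success. Returns one boolean per fork.
def run_in_forks(groups, boot_wait: BOOT_WAIT, launcher: ->(cmd) { system(cmd) })
  pids = groups.each_with_index.map do |files, index|
    fork do
      docker_commands(index).each { |cmd| launcher.call(cmd) }
      sleep boot_wait
      passed = yield(files, index)
      exit!(passed ? 0 : 1) # exit! skips at_exit hooks such as test autorun
    end
  end
  pids.map { |pid| Process.wait2(pid).last.success? }
end
```

In a Rakefile this would sit inside a `task` block. The block passed to `run_in_forks` would require each test file, point the tests at that fork's host ports, and then call the test runner. Stopping the containers afterwards is left out of the sketch.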
We use MiniTest (included in Ruby 2.0.0) as our test framework, and the task invokes the test runner explicitly using the `MiniTest::Unit.runner.run` method.
Getting better output

One of the first things you will notice is that the output from the tests running in each fork is printed as the tests run, so output from tests in fork 0 will appear alongside output from tests in fork 3. This makes it hard to track down errors and view the overall test results. To get around this, let's modify the code to redirect STDOUT and STDERR for each fork while the tests run:
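One way to do the redirection, sketched here under assumptions rather than copied from the original task, is to point `$stdout` and `$stderr` at an in-memory buffer inside each fork and print the buffer in one go once the tests finish. The helper names `capture_output` and `print_fork_report` are made up for this example.

```ruby
require "stringio"

# Run the block with $stdout and $stderr pointed at one in-memory buffer and
# return everything that was written. Only writes that go through $stdout or
# $stderr are captured; subprocess output and the STDOUT constant are not.
def capture_output
  original_stdout, original_stderr = $stdout, $stderr
  buffer = StringIO.new
  $stdout = $stderr = buffer
  yield
  buffer.string
ensure
  $stdout, $stderr = original_stdout, original_stderr
end

# Print one fork's captured output as a single block under a header.
def print_fork_report(fork_index, output, io = $stdout)
  io.puts "=== fork #{fork_index} ===", output
  io.flush
end
```

Inside a fork, usage would look like `print_fork_report(index, capture_output { run_the_tests })`, where `run_the_tests` stands in for whatever invokes your test runner.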
Now all of the output for each fork gets printed together.
Making sure we get the output

The final step is to make sure our output is available even when we hit an exception somewhere (hidden errors in tests are very bad). We can do this using a `begin`/`rescue` block, Ruby's equivalent of try/catch:
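A minimal sketch of that idea follows, assuming the output is being captured in a `StringIO` buffer. The method name `run_with_captured_output` and the report header are hypothetical. The `ensure` clause prints the buffer whether or not the block raised, and `ScriptError` is rescued alongside `StandardError` so that a test file that fails to load is reported too.

```ruby
require "stringio"

# Capture a fork's output and always print it, even if the block raises.
# Returns the block's value, or false when an exception was rescued.
def run_with_captured_output(fork_index, out = $stdout)
  buffer = StringIO.new
  original_stdout, original_stderr = $stdout, $stderr
  $stdout = $stderr = buffer
  begin
    yield
  rescue StandardError, ScriptError => e
    # Record the error next to the output that led up to it.
    buffer.puts "#{e.class}: #{e.message}", *Array(e.backtrace)
    false
  ensure
    $stdout, $stderr = original_stdout, original_stderr
    out.puts "=== fork #{fork_index} ===", buffer.string
    out.flush
  end
end
```

Returning `false` on failure lets the calling fork turn the result into a non-zero exit status, so the parent process can tell that something went wrong.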
You can view the whole finished task as well.
How much better is it?

In short, a lot. It will depend on your specific scenario, however. In my case, I saw average test times drop from 25-30 minutes to 3-5 minutes. There are some opportunities for improvement, as some forks always finish well before others. Splitting test files into smaller, more discrete groups of methods would help distribute the load a bit, but again the whole process can only be as short as your longest test file.
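To make the "longest test file" limit concrete, here is a small worked example with invented durations, not measurements from my runs. It assigns files to forks greedily (longest first, each to the least loaded fork), which is an assumption for the illustration and not necessarily how your task distributes files.

```ruby
# Wall-clock time of a parallel run is the slowest fork's total.
def wall_clock(groups, durations)
  groups.map { |files| files.sum { |file| durations.fetch(file) } }.max
end

# Greedy assignment: take files longest-first, give each to the lightest fork.
def balance(durations, fork_count)
  groups = Array.new(fork_count) { [] }
  loads  = Array.new(fork_count, 0)
  durations.sort_by { |_file, minutes| -minutes }.each do |file, minutes|
    lightest = loads.index(loads.min)
    groups[lightest] << file
    loads[lightest] += minutes
  end
  groups
end

# Hypothetical durations in minutes: one 20-minute file and five small ones.
before = { "big_test.rb" => 20 }
5.times { |i| before["small_#{i}_test.rb"] = 2 }

# The same 30 minutes of tests, with the big file split into four parts.
after = {}
4.times { |i| after["big_part_#{i}_test.rb"] = 5 }
5.times { |i| after["small_#{i}_test.rb"] = 2 }

puts wall_clock(balance(before, 3), before) # 20: stuck at the longest file
puts wall_clock(balance(after, 3), after)   # 11: the split lets the forks share the work
```

With three forks, no assignment of the unsplit files can beat 20 minutes, because one fork has to run the 20-minute file on its own.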
Suggestions for improvements are welcome. You can find me at @palexander.