So your CI server is in place and developers are checking in their code daily. Then an email notification goes out to everybody: FAILED BUILD.
What do I do?
Fix the build right away, unless of course there is a production issue that has to come first. The sooner you fix the build, the less time it will take, because the change that broke it is still fresh in everyone’s mind.
What don’t I do?
Don’t comment out the test. For Java developers, that also means you should refrain from adding the @Ignore annotation to the test and checking it in. The test failed for a reason: either the test itself needs to be updated or there is a real problem with the application code. In either case, it should be addressed right away.
How do I prevent this from happening?
Use information radiators. Have a whiteboard set up with everybody’s name. When somebody breaks the build, add a tick next to their name. At the end of the week, whoever has the most ticks is responsible for bringing in donuts the following week. In the event of a tie, the more senior developer is responsible (just because they should know better). You can use your own variation of this, but you get the idea. This is a great way of encouraging people to run their tests before they check in code. Penalizing people in a fun way also has two main benefits: first, it gets the point across, and second, it does so in a non-confrontational way.
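If your team would rather keep the tally in code than on a whiteboard, the rules above can be sketched in a few lines of Java. This is only an illustration: the class BrokenBuildBoard, its methods, and the use of years of experience as a stand-in for "more senior" are all made up for this example and do not come from any CI tool.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the whiteboard; none of these names come from a real tool.
class BrokenBuildBoard {

    // One row on the whiteboard: a developer, their seniority, and their ticks.
    static final class Entry {
        final String name;
        final int yearsOfExperience; // assumed stand-in for "more senior"
        int ticks;

        Entry(String name, int yearsOfExperience) {
            this.name = name;
            this.yearsOfExperience = yearsOfExperience;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void addDeveloper(String name, int yearsOfExperience) {
        entries.add(new Entry(name, yearsOfExperience));
    }

    // Add a tick next to the name of whoever broke the build.
    void recordBrokenBuild(String name) {
        for (Entry e : entries) {
            if (e.name.equals(name)) {
                e.ticks++;
                return;
            }
        }
        throw new IllegalArgumentException("Unknown developer: " + name);
    }

    // Most ticks brings the donuts; on a tie, the more senior developer does.
    // Returns null if nobody broke the build this week.
    String donutDuty() {
        return entries.stream()
                .filter(e -> e.ticks > 0)
                .max(Comparator.comparingInt((Entry e) -> e.ticks)
                        .thenComparingInt(e -> e.yearsOfExperience))
                .map(e -> e.name)
                .orElse(null);
    }

    public static void main(String[] args) {
        BrokenBuildBoard board = new BrokenBuildBoard();
        board.addDeveloper("Ana", 8);
        board.addDeveloper("Ben", 2);
        board.recordBrokenBuild("Ben");
        board.recordBrokenBuild("Ana");
        // One tick each, so the tie goes to the more senior developer.
        System.out.println("Donuts next week: " + board.donutDuty());
    }
}
```

The tie-break is just a second sort key, so swapping in your own variation (alphabetical order, most recent offender, and so on) means changing one comparator.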
Perform root cause analysis. Keep asking why until you figure out what the real problem is.
Why did the tests fail? Because I didn’t run them locally.
Why didn’t you run them locally? Because the tests take too long to run.
Why do the tests take too long to run? Because the system tests perform end to end testing.
Maybe we should have a lightweight build that only runs unit tests and a separate heavyweight build that runs all tests on a nightly basis.
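One way to picture that split is to mark the slow tests and let each build decide whether to include them. The sketch below uses plain Java reflection so that it runs on its own. The @SlowTest annotation, the OrderTests class, and BuildSplit.testsToRun are all hypothetical names invented for this example. In a real project you would more likely lean on your test framework's own grouping mechanism, such as JUnit 4 categories or JUnit 5 tags, and configure the two CI jobs accordingly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical marker for slow end-to-end tests; this is not a JUnit annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SlowTest {}

// Stand-in test class: one fast unit test and one slow system test.
class OrderTests {
    public void totalIsSumOfLineItems() { /* fast unit test body */ }

    @SlowTest
    public void checkoutWorksEndToEnd() { /* slow system test body */ }
}

class BuildSplit {
    // Lightweight build: skip anything marked @SlowTest.
    // Heavyweight (nightly) build: select everything.
    static List<String> testsToRun(Class<?> testClass, boolean nightly) {
        Method[] methods = testClass.getDeclaredMethods();
        // getDeclaredMethods() guarantees no particular order, so sort by name.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        List<String> selected = new ArrayList<>();
        for (Method m : methods) {
            if (m.isSynthetic()) {
                continue; // ignore compiler- or tool-generated methods
            }
            if (nightly || !m.isAnnotationPresent(SlowTest.class)) {
                selected.add(m.getName());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        System.out.println("Commit build:  " + testsToRun(OrderTests.class, false));
        System.out.println("Nightly build: " + testsToRun(OrderTests.class, true));
    }
}
```

The point of the design is that developers get a build fast enough to run before every check-in, while the end-to-end tests still run at least once a day.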
You should only put measures in place to track broken builds if they become a common theme. Remember, you want to make a broken build a rare occurrence. Once you get to that point, you no longer need to track it. So don’t encourage it.