Is there hard evidence of the ROI of unit testing?

Unit Testing Problem Overview

Unit testing sounds great to me, but I'm not sure I should spend any time really learning it unless I can convince others that it has significant value. I have to convince the other programmers and, more importantly, the bean-counters in management, that all the extra time spent learning the testing framework, writing tests, keeping them updated, etc. will pay for itself, and then some.

What proof is there? Has anyone actually developed the same software with two separate teams, one using unit testing and the other not, and compared the results? I doubt it. Am I just supposed to justify it with, "Look it up on the Internet, everybody's talking about it, so it must be the right thing to do"?

Where is the hard evidence that will convince the laymen that unit testing is worth the effort?

Unit Testing Solutions

Solution 1 - Unit Testing

Yes. There is a study by Boby George and Laurie Williams at NCSU and another by Nagappan et al. I'm sure there are more. Dr. Williams' publications on testing may provide a good starting point for finding them.

[EDIT] The two papers above specifically reference TDD and show a 15-35% increase in initial development time after adopting TDD, but a 40-90% decrease in pre-release defects. If you can't get at the full text versions, I suggest using Google Scholar to see if you can find a publicly available version.

Solution 2 - Unit Testing

"I have to convince the other programmers and, more importantly, the bean-counters in management, that all the extra time spent learning the testing framework, writing tests, keeping them updated, etc. will pay for itself, and then some."

Why?

Why not just do it, quietly and discreetly? You don't have to do it all at once. You can do this in little tiny pieces.

The framework learning takes very little time.

Writing one test, just one, takes very little time.

Without unit testing, all you have is some confidence in your software. With one unit test, you still have your confidence, plus proof that at least one test passes.

That's all it takes. No one needs to know you're doing it. Just do it.
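As a rough illustration of how small "just one test" can be, here is a minimal sketch using Python's standard `unittest` module. The `apply_discount` function and its behaviour are hypothetical stand-ins for whatever small piece of your own code you pick first; nothing here comes from the original answer.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical production function: reduce price by the given percentage."""
    return price * (100 - percent) / 100


class ApplyDiscountTest(unittest.TestCase):
    def test_ten_percent_off_two_hundred(self):
        # One concrete, checkable fact about the code.
        self.assertEqual(apply_discount(200, 10), 180)
```

Saved in a file such as `test_discount.py` (the name is an assumption), this can be run with `python -m unittest test_discount`. It is a single passing test, which is all the answer asks for as a starting point.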

Solution 3 - Unit Testing

I take a different approach to this:

What assurance do you have that your code is correct? Or that it doesn't break assumption X when someone on your team changes func1()? Without unit tests keeping you 'honest', I'm not sure you have much assurance.

The notion of keeping tests updated is interesting. The tests themselves don't often have to change. I've got 3x the test code compared to the production code, and the test code has been changed very little. It is, however, what lets me sleep well at night and the thing that allows me to tell the customer that I have confidence that I can implement the Y functionality without breaking the system.

Perhaps in academia there is evidence, but I've never worked anywhere in the commercial world where anyone would pay for such a test. I can tell you, however, that it has worked well for me. It took little time to get accustomed to the testing framework, and writing tests made me really think about my requirements and the design, far more than I ever did when working on teams that wrote no tests.

Here's where it pays for itself: 1) You have confidence in your code and 2) You catch problems earlier than you would otherwise. You don't have the QA guy say "hey, you didn't bother bounds-checking the xyz() function, did you?" He doesn't get to find that bug because you found it a month ago. That is good for him, good for you, good for the company and good for the customer.
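The bounds-checking case can be sketched as a handful of tests. The `xyz` function below is a hypothetical stand-in for the one named in the answer (its signature and behaviour are assumptions made for illustration); the tests show the kind of check that would surface a missing bounds check long before QA sees the code.

```python
import unittest


def xyz(values, index):
    """Hypothetical xyz(): return values[index], rejecting out-of-range indexes."""
    if not 0 <= index < len(values):
        raise IndexError(f"index {index} out of range for {len(values)} items")
    return values[index]


class XyzBoundsTest(unittest.TestCase):
    def test_returns_item_inside_bounds(self):
        self.assertEqual(xyz([10, 20, 30], 1), 20)

    def test_rejects_index_past_the_end(self):
        with self.assertRaises(IndexError):
            xyz([10, 20, 30], 3)

    def test_rejects_negative_index(self):
        # Without the explicit check, plain list indexing would silently
        # wrap around and return the last item instead of failing.
        with self.assertRaises(IndexError):
            xyz([10, 20, 30], -1)
```

If the range check in `xyz` were removed, the negative-index test would fail, which is exactly the bug the QA person would otherwise have reported.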

Clearly this is anecdotal, but it has worked wonders for me. Not sure I can provide you with spreadsheets, but my customer is happy and that is the end goal.

Solution 4 - Unit Testing

We've demonstrated with hard evidence that it's possible to write crappy software without Unit Testing. I believe there's even evidence for crappy software with Unit Testing. But this is not the point.

Unit Testing or Test Driven Development (TDD) is a Design technique, not a test technique. Code that's written test driven looks completely different from code that is not.

Even though this is not your question, I wonder whether the easiest route is really to answer questions that may be the wrong ones to ask, bringing evidence that other reports might challenge. Even if you find hard evidence for your case, somebody else might find hard evidence against it.

Is it the business of the bean counters to determine how the technical people should work? Are they providing the cheapest tools in all cases because they believe you don't need more expensive ones?

This argument is either won based on trust (one of the fundamental values of agile teams) or lost based on role power of the winning party. Even if the TDD-proponents win based on role power I'd count it as lost.

Solution 5 - Unit Testing

More about TDD than strictly unit testing, here is the paper "Realizing quality improvement through test driven development: results and experiences of four industrial teams", by Nachiappan Nagappan, E. Michael Maximilien, Thirumalesh Bhat, and Laurie Williams, published by the Microsoft Empirical Software Engineering and Measurement (ESM) group and already mentioned here.

The study found that the TDD teams produced code that was between 40% and 90% better (in terms of defect density) than that of non-TDD teams. However, the TDD teams took between 15% and 35% longer to complete their projects.

Solution 6 - Unit Testing

If you are also interested in evidence against unit testing here is one well researched and thought out article:

"Why Most Unit Testing is Waste" by James O. Coplien (lean and agile guru)

Solution 7 - Unit Testing

Here's a great and entertaining read about a guy changing his company from within. It's not limited to TDD. http://jamesshore.com/Change-Diary/ Note that he didn't persuade the "bean counters" for quite some time and used "guerrilla tactics" instead.

Solution 8 - Unit Testing

Just to add more information to these answers, there are two meta-analysis resources that may help in assessing the productivity and quality effects of TDD in academic and industrial settings:

Guest Editors' Introduction: TDD—The Art of Fearless Programming [link]

> All researchers seem to agree that TDD encourages better task focus and test coverage. The mere fact of more tests doesn't necessarily mean that software quality will be better, but the increased programmer attention to test design is nevertheless encouraging. If we view testing as sampling a very large population of potential behaviors, more tests mean a more thorough sample. To the extent that each test can find an important problem that none of the others can find, the tests are useful, especially if you can run them cheaply.
>
> Table 1. A summary of selected empirical studies of test-driven development: industry participants*
>
> Table 2. A summary of selected empirical studies of TDD: academic participants*

The Effects of Test-Driven Development on External Quality and Productivity: A Meta-Analysis [link]

Abstract:

> This paper provides a systematic meta-analysis of 27 studies that investigate the impact of Test-Driven Development (TDD) on external code quality and productivity.
>
> The results indicate that, in general, TDD has a small positive effect on quality but little to no discernible effect on productivity. However, subgroup analysis has found both the quality improvement and the productivity drop to be much larger in industrial studies in comparison with academic studies. A larger drop of productivity was found in studies where the difference in test effort between the TDD and the control group's process was significant. A larger improvement in quality was also found in the academic studies when the difference in test effort is substantial; however, no conclusion could be derived regarding the industrial studies due to the lack of data.
>
> Finally, the influence of developer experience and task size as moderator variables was investigated, and a statistically significant positive correlation was found between task size and the magnitude of the improvement in quality.

Solution 9 - Unit Testing

There are statistics that prove that fixing a bug found in unit/integration testing costs many times less than fixing it once it's on the live system (they are based on monitoring thousands of real-life projects).

Edit: for example, as pointed out, the book "Code Complete" reports on such studies (section 20.3, "Relative Effectiveness of Quality Techniques"). But there is also private research in the consulting field that proves that as well.

Solution 10 - Unit Testing

Well, there are some large companies that require you to use unit testing, but if you are a small company, why mimic large ones?

For me, when I started with unit testing many years ago (today we mostly use a behavior model), it was because I could not control all the paths in one application.

I was used to bottom-up programming and a REPL, so when I got unit tests (one test for every function) it was like bringing a REPL back to languages that were very much compiled. It brought the fun back to every line of code I wrote. I felt good. I liked it. I didn't need a report to tell me that I had begun writing better code faster. My boss didn't need a report to notice that, because we were doing this crazy stuff, we suddenly never missed a deadline. My boss didn't need a report to notice that the number of "plain" bugs dropped from (too many) to nearly nil because of this very strange thing of writing non-productive code.

As another poster already wrote, you don't use TDD to test (verify). You write tests to capture the specification: the behaviour of your unit (object, module, function, class, server, cluster).
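One way to picture tests as a specification is to name each test after a rule the unit must obey, so the test list reads like a short behavioural description. The sketch below uses Python's `unittest`; the `Stack` class and its rules are hypothetical and chosen only for illustration.

```python
import unittest


class Stack:
    """Hypothetical unit whose behaviour is being specified."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class StackSpecification(unittest.TestCase):
    def test_a_new_stack_is_empty(self):
        self.assertTrue(Stack().is_empty())

    def test_pop_returns_the_most_recently_pushed_item(self):
        stack = Stack()
        stack.push("first")
        stack.push("second")
        self.assertEqual(stack.pop(), "second")

    def test_popping_an_empty_stack_is_an_error(self):
        with self.assertRaises(IndexError):
            Stack().pop()
```

Read top to bottom, the three test names state what a stack is expected to do, independent of how `Stack` happens to be implemented.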

There are a lot of failure and success stories about switching to a different model of developing software in a lot of companies.

I just started to use it whenever I had something new to write. There is an old saying that is somewhat hard for me to translate into English, but it goes:

> Start with something so simple that you don't notice that you do it. When training for a marathon, start by walking 9 meters and run 1 meter, repeat.

Solution 11 - Unit Testing

I do have one set of data points for this - from an experience that sold me on unit tests.

Many moons ago I was a fresh graduate working on a large VB6 project and had occasion to write a large body of stored procedure code. Of the subsystem I was writing it made up about 1/4 of the whole code base - around 13,000 LOC out of 50K or so.

I wrote a set of unit tests for the stored procedures but unit testing VB6 UI code is not really feasible without tools like Rational Robot; at least it wasn't back then.

The statistics from QA on the piece were that about 40 or 50 defects were raised on the whole subsystem, of which two originated from the stored procedures. That's one defect per 6,500 lines of code vs. 1 per 1,000-1,200 or so across the whole piece. Bear in mind also, that about 2/3 of the VB6 code was boilerplate code for error handling and logging, identical across all of the procedures.

Without too much handwaving you can ascribe at least an order-of-magnitude improvement in defect rates to the unit testing.
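The arithmetic behind that estimate can be sketched as follows. The inputs are the figures quoted above; the assumptions (labelled in the comments) are that the 50K total includes the stored procedures, that all remaining defects belong to the VB6 code, and that the boilerplate two thirds contributed no defects of its own.

```python
# Figures quoted in the answer.
TOTAL_LOC = 50_000
SPROC_LOC = 13_000
SPROC_DEFECTS = 2
TOTAL_DEFECTS_RANGE = (40, 50)
BOILERPLATE_SHARE = 2 / 3  # share of the VB6 code that was boilerplate

# Unit-tested stored procedures: lines of code per defect.
sproc_loc_per_defect = SPROC_LOC / SPROC_DEFECTS


def vb6_loc_per_defect(total_defects, exclude_boilerplate):
    """Lines of VB6 code per defect.

    Assumption: every defect not traced to the stored procedures
    originated in the VB6 code.
    """
    vb6_loc = TOTAL_LOC - SPROC_LOC
    if exclude_boilerplate:
        # Assumption: boilerplate caused no defects, so only the
        # hand-written third of the VB6 code is counted.
        vb6_loc *= 1 - BOILERPLATE_SHARE
    return vb6_loc / (total_defects - SPROC_DEFECTS)


for total_defects in TOTAL_DEFECTS_RANGE:
    raw = vb6_loc_per_defect(total_defects, exclude_boilerplate=False)
    adjusted = vb6_loc_per_defect(total_defects, exclude_boilerplate=True)
    print(
        f"{total_defects} defects: VB6 at 1 per {raw:.0f} LOC "
        f"(1 per {adjusted:.0f} excluding boilerplate), "
        f"improvement factor {sproc_loc_per_defect / adjusted:.1f}"
    )
```

Under these assumptions the stored procedures come out at one defect per 6,500 lines, and the boilerplate-adjusted comparison gives a factor of roughly 20 to 25, which is where the order-of-magnitude claim comes from. Comparing against the unadjusted VB6 code gives a smaller factor, so the conclusion leans on the boilerplate adjustment.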

Content Type	Original Author	Original Content on Stackoverflow
Question	raven	View Question on Stackoverflow
Solution 1 - Unit Testing	tvanfosson	View Answer on Stackoverflow
Solution 2 - Unit Testing	S.Lott	View Answer on Stackoverflow
Solution 3 - Unit Testing	itsmatt	View Answer on Stackoverflow
Solution 4 - Unit Testing	Olaf Kock	View Answer on Stackoverflow
Solution 5 - Unit Testing	philant	View Answer on Stackoverflow
Solution 6 - Unit Testing	mmm	View Answer on Stackoverflow
Solution 7 - Unit Testing	Epaga	View Answer on Stackoverflow
Solution 8 - Unit Testing	Dariusz Woźniak	View Answer on Stackoverflow
Solution 9 - Unit Testing	friol	View Answer on Stackoverflow
Solution 10 - Unit Testing	Jonke	View Answer on Stackoverflow
Solution 11 - Unit Testing	ConcernedOfTunbridgeWells	View Answer on Stackoverflow