When to rewrite a code base from scratch

Architecture, TDD, Testing

Architecture Problem Overview


I think back to Joel Spolsky's article about never rewriting code from scratch. To sum up his argument: The code doesn't get rusty, and while it may not look pretty after many maintenance releases, if it works, it works. The end user doesn't care how pretty the code is.

You can read the article here: Things You Should Never Do

I've recently taken over a project, and after looking through the code, it's pretty awful. I immediately thought of prototypes I had built before, for which I had explicitly stated that they should not be used in any production environment. But of course, people don't listen.

The code is built as a website, has no separation of concerns, no unit testing, and code duplication everywhere. No Data layer, no real business logic, unless you count a bunch of classes in App_Code.

I've made the recommendation to the stakeholders that, while we should keep the existing code and do bug fix releases and some minor feature releases, we should start rewriting it immediately with Test Driven Development in mind and with clear separation of concerns. I'm thinking of going the ASP.NET MVC route.

My only concern is, of course, the length of time it might take to rewrite from scratch. It's not especially complicated: a pretty run-of-the-mill web application with membership, etc.

Have any of you come across a similar problem? Any particular steps you took?

UPDATE:

So, what did I end up deciding to do? I took Matt's approach and decided to refactor many areas.

  • Since App_Code was getting rather large and thus slowing down the build time, I removed many of the classes and converted them into a Class Library.

  • I created a very simple Data Access Layer, which contained all of the ADO calls, and created a SqlHelper object to execute these calls.

  • I implemented a cleaner logging solution, which is much more concise.
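For readers who want a picture of what such a thin data access layer looks like, here is a minimal sketch. The original used ADO.NET and a SqlHelper object; this version is only an analogy in Python with the standard sqlite3 module, and the names `SqlHelper`, `MemberRepository`, and the `members` table are assumptions made up for illustration.

```python
import sqlite3
from typing import Any, Sequence


class SqlHelper:
    """Hypothetical helper: the only place that talks to the database driver."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def execute(self, sql: str, params: Sequence[Any] = ()) -> int:
        """Run a statement that changes data and return the affected row count."""
        with self._conn:  # commits on success, rolls back on an exception
            return self._conn.execute(sql, params).rowcount

    def query(self, sql: str, params: Sequence[Any] = ()) -> list[tuple]:
        """Run a SELECT and return all rows."""
        return self._conn.execute(sql, params).fetchall()


class MemberRepository:
    """Hypothetical data access class; pages call this instead of raw SQL."""

    def __init__(self, helper: SqlHelper) -> None:
        self._helper = helper

    def add(self, name: str) -> None:
        self._helper.execute("INSERT INTO members (name) VALUES (?)", (name,))

    def names(self) -> list[str]:
        rows = self._helper.query("SELECT name FROM members ORDER BY name")
        return [row[0] for row in rows]


# Usage with an in-memory database so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
helper = SqlHelper(conn)
helper.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
members = MemberRepository(helper)
members.add("bob")
members.add("alice")
```

The point is only that SQL strings and connection handling live in one place, so the rest of the site can be changed without touching them.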

While I no longer work on this project [funding, politics, blah blah], I think it gave me some enormous insight into how badly some projects can be written, and the steps one developer can take to make things a lot cleaner, more readable and just flat out better with small, incremental steps over time.

Architecture Solutions


Solution 1 - Architecture

Just because it has all those problems now doesn't mean it has to continue to have them. If you find yourself making a specific bug fix in the system that could benefit from, say, a new data layer, then create a new data layer. Just because the whole site doesn't use it doesn't mean you can't start using one. Refactor as you need to during your bug fixes. And make sure you understand exactly what the code is doing before you change it.

Problem with code duplication? The next time you have to fix a bug in the duplicated code, pull it out into a class or utility library in a central location.
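As a small illustration of that kind of extraction (a sketch only, with made-up function names, not code from the project in question): two handlers that each carried their own copy of an email clean-up now share one function, so the next bug fix happens in a single place.

```python
def normalize_email(raw: str) -> str:
    """Single shared copy of logic that used to be pasted into each page."""
    return raw.strip().lower()


def register_member(form: dict[str, str]) -> str:
    # Previously this handler had its own inline strip/lower code.
    return normalize_email(form["email"])


def find_member(query: dict[str, str]) -> str:
    # Previously a second, slightly different inline copy lived here.
    return normalize_email(query["email"])
```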

And, as already mentioned by other responders, start writing tests now. It may be hard if the code is as coupled as it sounds, but you can probably start somewhere.

There is no good reason to rewrite working code. However, if you are already fixing a bug, there is no reason you can't rework that specific part of the code with a "better" design.

Solution 2 - Architecture

Joel's article really says it all.

Basically never.

As Joel points out: you'll simply lose too much doing it from scratch. It'll probably take way longer than you think and what's the end result? Something that basically does the same thing. So what's the business case for doing it?

That's an important point: it costs money to write something from scratch. How will you recoup that money? Many programmers ignore this point simply because they don't like the code--sometimes with justification, sometimes not.

Solution 3 - Architecture

The book [Facts and Fallacies Of Software Engineering][1] states this fact: "Modification of reused code is particularly error-prone. If more than 20 to 25 percent of a component is to be revised, it is more efficient and effective to rewrite it from scratch." The numbers come from some statistical studies performed on the subject. I think the numbers may vary due to the quality of the code base, so in your case, it seems to be more efficient and effective to rewrite it from scratch by taking this statement into account.

[1]: http://www.amazon.com/Facts-Fallacies-Software-Engineering-Development/dp/0321117425 "Facts And Fallacies Of Software Engineering"

Solution 4 - Architecture

I have had such an application, and the rewrite was very rewarding. However, you should try to avoid the "improvement" trap.

When you rewrite everything, it is very tempting to add new features and fix some long-standing issues you didn't have the guts to touch. This can lead to feature creep and also extend the time needed for rewrite enormously.

Make sure you decide what exactly will be changed and what will only be rewritten - in advance.

Solution 5 - Architecture

I disagree with that article somewhat. For the most part Joel is correct but there are counter-examples that indicate sometimes (even if rarely) a rewrite is a good idea. E.g.,

  • Windows NT (Broke away from the old DOS code-base. Upon this foundation was built Win2k, WinXP and the upcoming Win7. Yes, Vista too. The last version of Windows on the old base was the infamous WinME)
  • Mac OS X (Rebuilt their flagship product on FreeBSD)
  • Many cases where a competitor displaces a de facto standard. (e.g., Excel vs. Lotus 123)

I believe Joel's argument is mainly based on fairly well-written code in the existing version that could be improved with hindsight. By all means, if the code you inherited is really that bad, push for a rewrite--there's some scary stuff out there. If it's at all tolerable and works reasonably well, phase in the new stuff at a slower pace.

Solution 6 - Architecture

I have been part of a small dedicated team that rewrote code from scratch, including reverse engineering the business rules of the earlier code. The original application was a web service written in C++ (with regular crashes and severe memory leaks) and an ASP.Net 1.0 web application, and the replacement was a C# 2.0 asmx-based web service and an ASP.Net 2.0 web application with Ajax. That said, here are some of the things the team did and explained to management:

  1. We supported the existing code base in production until the new code was ready.
  2. The management agreed that the rewrite (first release) would introduce no new features but just implement existing features. We added only 1-2 new features at the end.
  3. The small team was composed of very experienced developers with excellent understanding and cooperation.
  4. It was harder to get C++ talent in the organisation and C# was seen as a better alternative for future maintenance.
  5. We agreed to an aggressive timeframe but at the same time were confident and highly motivated to work in C# 2.0, ASP.Net 2.0 etc.
  6. We had a team leader to shield us from upper management, and we followed a Scrum-like process.

The project was highly successful. It was very stable and performed much better. Later it was easier to add new features. So I believe that a code rewrite can be done successfully given the right resources and circumstances.

Solution 7 - Architecture

Only one quasi-legitimate reason comes to mind: politics.

I've had to rewrite a codebase from scratch, and it had to do with politics. Basically, the previous coder who managed the codebase was too embarrassed to release the source code to the new team that had just been hired. She felt that every criticism of the code was a criticism of her as a person, and as a result, she only released code to the rest of us when she was forced. She is the only person with administrative access to the source repository, and whenever she's been asked to release all the source, she's threatened to quit and take all of her knowledge of the code and go home.

This codebase is over 15 years old, and has convolutions and contortions from various different people with various different styles. None of those styles apparently involved comments or specifications, at least, in the small portions she's released to us.

With only partial code available and a deadline, I was forced to do a total rewrite. I got yelled at as a result, because it was claimed that I caused a serious delay, but I just kept my head down and got it done rather than argue.

Politics can be a huge pain.

Solution 8 - Architecture

At some point, you have to cut your losses. If you've just inherited this code base, you might make changes that have unintended consequences, and due to the lack of tests, they'll be nearly impossible to find.

At the very least, start writing tests immediately.

Solution 9 - Architecture

I have been in precisely this situation but rather than a total rewrite I worked to change things through a refactoring process. The problem I ran into was the enormous complexity of the code I was working with- many pages of horrible, special-case-driven development all based on if-cases and convoluted regexes layered back over about ten years of unplanned growth and expansion.

My aim was to get it refactored function by function so that it would provide the same output for the same inputs but work much more cleanly and smoothly under the bonnet to facilitate future growth and improve performance. The general solution was clean and quick but the fixing job on the code just got more and more difficult and complicated as obscure special-cases in the documents being parsed by the system started to show themselves and my nice clean code would generate output that was just a little too different from what the original did ( this was web pages, so a different amount of whitespace could cause all kinds of layout problems on older IE versions ) in small and obscure ways.
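The "same output for the same inputs" goal can be checked mechanically. The following is only a sketch under assumed names: `legacy_render` and `new_render` are toy stand-ins for the old and reworked functions, and `find_mismatches` is a hypothetical helper that runs both over sample inputs and reports where they disagree, optionally ignoring whitespace.

```python
import re
from typing import Callable, Iterable


def legacy_render(title: str) -> str:
    """Stand-in for the old code: a special case plus a trailing newline."""
    if title == "":
        return "<h1>Untitled</h1>\n"
    return "<h1>" + title + "</h1>\n"


def new_render(title: str) -> str:
    """Stand-in for the cleaner replacement."""
    return f"<h1>{title or 'Untitled'}</h1>"


def find_mismatches(
    old: Callable[[str], str],
    new: Callable[[str], str],
    inputs: Iterable[str],
    ignore_whitespace: bool = False,
) -> list[str]:
    """Return the inputs for which the two implementations disagree."""

    def canon(text: str) -> str:
        # Dropping all whitespace is a crude comparison, chosen for brevity.
        return re.sub(r"\s+", "", text) if ignore_whitespace else text

    return [x for x in inputs if canon(old(x)) != canon(new(x))]


samples = ["", "Home", "About us"]
strict = find_mismatches(legacy_render, new_render, samples)
loose = find_mismatches(legacy_render, new_render, samples, ignore_whitespace=True)
```

Here the strict comparison flags every sample because of the trailing newline, while the whitespace-insensitive one flags none. That is exactly the sort of difference that mattered in the situation described, since whitespace could change the layout, so the strict mode is the safer default.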

I don't know if the reworked code ever got used- I left the company before it had the chance to be fully integrated- but I doubt it. Why use twenty lines of code when fifteen hundred 'if' statements and three-line regular expressions could do the same job?

Solution 10 - Architecture

Instead of a complete rewrite from scratch you want to start refactoring the code base in small steps while introducing unit tests. For example

  1. Move duplicate code into a common class with tests for reuse throughout the project.

  2. Introduce interfaces to create separate testable modules. You can then refactor the implementation behind the interface while relying on your tests to ensure you don't break anything.
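A rough sketch of the second step, written in Python rather than the C# the question implies: `MemberStore`, `InMemoryMemberStore`, and `register` are hypothetical names, and the uniqueness rule is invented purely to have something to test.

```python
from typing import Protocol


class MemberStore(Protocol):
    """Hypothetical interface separating business logic from storage."""

    def exists(self, name: str) -> bool: ...

    def add(self, name: str) -> None: ...


class InMemoryMemberStore:
    """Test double; a database-backed class would offer the same methods."""

    def __init__(self) -> None:
        self._names: set[str] = set()

    def exists(self, name: str) -> bool:
        return name in self._names

    def add(self, name: str) -> None:
        self._names.add(name)


def register(store: MemberStore, name: str) -> bool:
    """Rule under test: names must be non-empty and unique, ignoring case."""
    key = name.strip().lower()
    if not key or store.exists(key):
        return False
    store.add(key)
    return True


store = InMemoryMemberStore()
first = register(store, "Alice")
duplicate = register(store, " alice ")
```

Because `register` only depends on the interface, the storage implementation behind it can be refactored or replaced while tests like these keep passing.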

Solution 11 - Architecture

One danger in a complete rewrite is that your job is constantly on the line. You're a cost that isn't contributing to the bottom line. The code that sucks is the code that's making the money.

But if you fix the existing code one piece at a time, you're the guy who knows how the money machine works.

Solution 12 - Architecture

I would rather do things bit by bit, e.g., create a back-end to the database with a data model as you work in those areas (i.e., user login first, then user management, and so on), and tweak the existing front-end to use the new back-end (interface driven, so you can also add tests). This will keep the existing code with possible undocumented tweaks and behaviours that you wouldn't replicate by developing again from scratch, whilst adding in some separation of concerns.

After a while you will have migrated some 60% of the code base to use the new back-ends without the work being an official project, just maintenance, so you will be in a better position to argue for development time to do the other 40%, and once that is done the existing front-end classes will be vastly reduced in size and complexity. Once it is fully migrated, you will be able to reuse the new back-end model and controller components if you ever get the time to implement a new view.

Solution 13 - Architecture

My answer is: rewrite from scratch as often as possible.

I've spent most of my career inheriting steaming piles of dung we politely called "programs", written by young, inexperienced programmers who were considered "rock stars" by the managers. These things are generally unfixable, and you end up spending 10 times as much effort keeping them limping along as you would have spent just rewriting them from the ground up.

But I've also benefited tremendously by rewriting my own work periodically. Every rewrite is a chance to do things differently and potentially better, and you should be able to reuse at least some parts of the older version.

That being said, not all rewrites are a good idea. Windows Vista, for example.

Solution 14 - Architecture

Start by writing a technical spec. If the code is that awful, then I bet there isn't a real spec either. So write a comprehensive and detailed spec - you need to write a spec anyway if you want to rewrite from scratch, so the time is a good investment. Be careful to include all details about the functionality. Since you are able to investigate the actual behavior of the app, this should be easy. Feel free to include improvement suggestions, but be sure to capture all details of the current behavior.

As part of the investigation you might consider writing some automated tests of the system to investigate and document expected behavior. Focus on black-box/integration testing rather than unit-testing (which the code will probably not allow anyway if it is that ugly).

When you have this spec you will likely discover that the app is actually much more complex than your first impression, and reconsider rewriting from scratch. If you decide to gradually refactor instead, the spec and tests will help you a lot. But if you still decide to go forward and rewrite, then you have a good spec to work from now, and a suite of integration tests which will tell you when your work is complete.
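One way to write such behavior-documenting tests is to record what the current system returns and treat that recording as the spec. The sketch below is illustrative only: `legacy_shipping_cost` is an invented stand-in for existing code, and the recorded values are what that stand-in produces, not data from any real application.

```python
def legacy_shipping_cost(weight_kg: float, country: str) -> float:
    """Stand-in for existing code whose rules nobody wrote down."""
    cost = 5.0 + 2.0 * weight_kg
    if country == "US":
        cost -= 1.0
    if weight_kg > 10:
        cost *= 1.5
    return round(cost, 2)


# Recorded by running the current system, not derived from a requirements doc.
RECORDED_BEHAVIOR = {
    (1.0, "US"): 6.0,
    (1.0, "DE"): 7.0,
    (12.0, "US"): 42.0,
}


def check_against_recording(implementation) -> list[tuple]:
    """Return the recorded cases that the given implementation gets wrong."""
    return [
        case
        for case, expected in RECORDED_BEHAVIOR.items()
        if implementation(*case) != expected
    ]


failures = check_against_recording(legacy_shipping_cost)
```

The same check can later be pointed at a refactored or rewritten implementation; an empty list means it still matches every recorded case.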

Solution 15 - Architecture

I think this depends on two things:

  1. How flawed the underlying design of the legacy codebase is.

  2. The time it would take to do a rewrite.

On the first point: the company I work for used to have a horribly designed codebase, which made refactoring really difficult. We could not refactor one bit at a time, because the main problem was not with individual classes and functions but with the overall design. (If the overall design had been good but, say, individual functions were 300 lines long and needed breaking up, then refactoring would have made sense.)

On the second point: despite a lot of code and very convoluted processes, our engine was not doing all that much, so the rewrite did not take that long. Sometimes managers don't realize that the functionality of hundreds of thousands of lines of code can be rebuilt in a very short time.

We tried to explain this to our CTO (small company), but he still thought a rewrite would be too risky, so my co-worker and I rewrote the basic functionality of the engine in about four weekends. Then we showed it to our CTO, and he was finally convinced.

Now, if building the basic functionality had taken us six months, we wouldn't have had much of an argument.

Solution 16 - Architecture

There's an old adage that says:

> There's no such thing as bad code. There's only code that does what you want and code that doesn't.

The key to knowing when to re-write lies in there. Does the system currently do what you want? If the answer is yes, slow but steady improvements are your best bet. If the answer is no, a re-write is what you want.

Going back to Joel's essay, he talks about code that's messy but software that is reliable and delivers the expected value. Suppose instead you have unreliable code that is full of major bugs and doesn't cover all your use cases: things that were supposed to be there either don't work or are just missing. In this case, all the little hairs growing out of it aren't bug fixes, but cancer.

Solution 17 - Architecture

There is also a conflicting statement in economics that says,

> Never account for sunk costs

Sunk costs, according to Wikipedia (https://en.wikipedia.org/wiki/Sunk_cost):

> In economics and business decision-making, a sunk cost is a cost that has already been incurred and cannot be recovered.

When sunk costs are coupled with political pressure or personal ego (what manager wants to be the one to admit that they made a poor decision or didn't properly monitor results, even if it was unavoidable or out of their immediate control?), it leads to a situation called escalation of commitment (https://en.wikipedia.org/wiki/Escalation_of_commitment), which is defined as:

> a pattern of behavior in which an individual or group, when faced with increasingly negative outcomes from some decision, action, and investment, will continue rather than alter their course—something which is irrational, but in alignment with decisions and actions previously made.

How does this apply to code?

Having had a rather long career as a software developer now, one common thread I've found is that, when faced with a challenging or ugly codebase (even if it is our own from two years ago), our first instinct is to want to throw out the old, ugly code and rewrite it from scratch. If it is a familiar codebase, then this is usually born from the fact that we are now much more familiar with the pitfalls of the project and business requirements than we were when we started the project, so we (perhaps subconsciously) yearn for the opportunity to fix our past sins by erasing them with perfection. If it is an unfamiliar codebase, we often tend to over-simplify the challenges faced by the original developers, glossing over "minor details" in favor of "big-picture" architectural-level thinking, and often blowing budgets and timeframes due to a lack of understanding of the complex minutiae of the business cases that the code was originally meant to solve.

Then there is the whole concept of technical debt, which, just like financial debt, CAN and WILL accrue to the point that a codebase becomes technically insolvent. More and more time and resources are invested into troubleshooting bugs, extinguishing fires, and overly-challenging improvements to an extent that forward progress becomes expensive, difficult, and perilous. Projects take longer and longer due to defects and being pulled off of project work to fix production issues. After hours "incidents" start becoming expected operation instead of some rare blip. Instead of stepping back and starting to do things right to increase our future productivity (and quality of life), we find ourselves in a position where we are forced to add more and more technical debt in order to meet deadlines - the technical equivalent to taking cash advances on a credit card to make a minimum payment on another card.

That all being said, it neither means that we should rewrite whenever possible, nor should we avoid rewriting working code at all costs. Both extremes are potentially wasteful, and the latter does tend to lead to escalation of commitment (because at all costs means with total disregard to costs, even if those costs completely outstrip the benefits). What needs to occur is an objective assessment of the costs and benefits of rewriting code versus making incremental improvements. The challenge is finding someone with both the expertise and objectivity to make that decision properly. For us developers, we are generally biased towards rewriting because it tends to be a lot more interesting and engaging than working on some crappy legacy codebase. Business managers tend to be biased the other direction because a rewrite imposes some unknowns with little perceivable immediate benefit. The result is generally the absence of a real decision, which then defaults to continuing to dump hours into existing code until some circumstance necessitates a directional shift (or the developer covertly rewrites the code, and usually gets a spanking for it).

I've worked on codebases that were somewhat salvageable, albeit ugly. They didn't follow established practices or standards, didn't use patterns, weren't pretty, but they performed their intended functions reasonably well and were flexible enough that they could be modified to meet anticipated future needs for the expected life of the application. While not glamorous, it was perfectly acceptable to keep this code alive while making incremental improvements when the opportunity arose. Doing otherwise would have produced little benefit other than looking pretty. I would say that most code about which the "should I rewrite this?" question arises falls under this category, and I find myself explaining to the junior developers on the team that, while it would be great fun to rewrite YetAnotherLineOfBusinessApp in {insert whizzbang framework here}, it is neither necessary nor desirable, and here are some ways we can improve it...

I've also worked on codebases that were hopeless. These were applications that barely launched in the first place, usually way behind schedule and in a reduced-functionality state. They were written in a way that no one but the original developer would have any chance of understanding what the code ultimately does. I refer to this as "read-only" code. Once it is written, any attempted change potentially results in systemic indecipherable failure of unknown origin, leading to panicked wholesale rewrites of massive monolithic code constructs that serve no purpose other than to educate the current developer on what is actually happening to a variable cleverly named obj_85 by the time execution reaches line 1,209 nested 7 levels deep in if... else..., switch, and foreach... statements somewhere in the DoEverythingAndMakeCoffee(...) method. Attempts to refactor this code result in failure. Every path you follow leads to another challenge, and more paths, and then paths that branch, and then circle back to a previous path, and after two weeks of heads-down refactoring of a single class you realize that, while maybe better encapsulated, the new code is nearly as whacky and obfuscated as the old code, probably contains even more bugs because the original intent of what you refactored was totally unclear, and, not knowing what exact business cases led to the original disaster in the first place, you can't be sure you've fully replicated the functionality. Progress is almost non-existent because translation of the codebase is nearly impossible and something as innocent as renaming a variable or using the proper type produces an exponential amount of unintended side effects.

Attempting to improve codebases like the above is an exercise in futility. Refactoring usually results in an 80% rewrite anyway, and the end result is nowhere near an 80% improvement. You end up with something that is very inconsistent, and the new code has a lot of compromises that had to be implemented in the interest of interoperability with legacy code (half of which were unnecessary because the legacy code that the new code needed to interoperate with later gets refactored out anyway). There are only two paths that can be followed: continue to accrue technical debt by hacking in "fixes" and modifications while hoping that the application is deprecated (or you get transferred to another project) before it collapses under its own weight, or have someone make the business decision and take the risk of doing a complete rewrite. I hate both of these options, because it usually means waiting until something critical has failed or a project is way behind schedule, and you then spend the next three months of evenings and weekends trying to get something breathing that probably never should have been alive in the first place.

So, how do you decide?

  1. How well does the existing code work? Is it reliable and relatively defect free?

  2. Are the people on my team capable of understanding what this code does with a reasonable degree of effort? If I bring in an experienced developer, will he/she be able to make enough sense of this to become productive in a reasonable timeframe?

  3. Do what-should-be-simple defects take geologic time measurements to fix; so much so that we are unable to make real improvements or meet project deadlines?

  4. Is the codebase so fragile and the expected lifetime such that the ability of the application to adapt to future anticipated business needs is highly questionable?

  5. Did the existing code actually meet the original functionality requirements?

  6. Is your organization even receptive to investing in the application, or is someone (especially someone at a higher level on the org chart) going to be handed their own *ss for the problem?

  7. Can you provide financial or risk-based justification, backed up by hard facts, to make a business case for a rewrite?

  8. After a FULL accounting of the time and costs of a rewrite (including developing proper specifications, quality assurance testing, post-production stabilization, and training), does it still make sense to start rewriting code? (We developers tend to think only in terms of coding time.)

  9. Do you have a choice? Is it even possible for the existing code to meet requirements (because if not, rewriting huge swaths is going to be part of the project and considered an "enhancement" instead of a rewrite)?

Solution 18 - Architecture

First, understand this is a vertical integration decision. Whether you replace a COBOL application with a .NET one, replace one API (or version) with another, decentralize a stored procedure into the SQL queries which consumed it, or refactor to extract an operation from functions, this is a decision about what operations to integrate in your system.

McKinsey's article "When and When Not to Vertically Integrate" explains a lot of useful things I won't repeat, because I don't completely agree with everything they say. https://www.mckinsey.com/business-functions/strategy-and-corporate-finance/our-insights/when-and-when-not-to-vertically-integrate

The best answer I've read for this question is, "Ask yourself does it compete." And I'm sorry I've lost that article, but this is your business decision. You can change it later. You should weigh things like the difficulty of working in and testing the code, especially how easily you can extend processes and add new processes -- this is horizontal and vertical growth, refer to HBR's 1978 article "How Should You Organize Manufacturing." My architecture has no equal in that area.

We have an ASPX application which I could rewrite in my architecture and MVC, but because ongoing changes to the application are very rare (less than yearly) and minor, other things are better use of my time. Changing interfaces can give users whiplash too and should be a last resort. I even avoided adding new fields to a web page because of the manual data entry work it would've created for users. Immediate control is the first thing people grab for, but it does not compete when continuous control is absent, the ability to exchange control, e.g. cruise control.

Stored procedures don't compete with spreadsheets, because users can understand the calculations and tell whether it filters out financial data, unlike the stored procedure I had to give management the bad news about. That said, no centralized or distributed process competes with an integrated one. Centralization costs controllability.

White papers I've found here and there say refactoring is most often done to centralize processes and very rarely to decentralize them. My architecture defines how to organize processing and thereby eliminates the ongoing need to refactor entirely. This is because it is organized as a manufacturing system, which can easily grow and easily replace steps regardless of process length. There's never anything left to extract.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Jack Marchetti | View Question on Stackoverflow
Solution 1 - Architecture | Matt | View Answer on Stackoverflow
Solution 2 - Architecture | cletus | View Answer on Stackoverflow
Solution 3 - Architecture | swamplord | View Answer on Stackoverflow
Solution 4 - Architecture | Milan Babuškov | View Answer on Stackoverflow
Solution 5 - Architecture | steamer25 | View Answer on Stackoverflow
Solution 6 - Architecture | softveda | View Answer on Stackoverflow
Solution 7 - Architecture | mmr | View Answer on Stackoverflow
Solution 8 - Architecture | Matt Grande | View Answer on Stackoverflow
Solution 9 - Architecture | glenatron | View Answer on Stackoverflow
Solution 10 - Architecture | Mark | View Answer on Stackoverflow
Solution 11 - Architecture | Nosredna | View Answer on Stackoverflow
Solution 12 - Architecture | JeeBee | View Answer on Stackoverflow
Solution 13 - Architecture | MusiGenesis | View Answer on Stackoverflow
Solution 14 - Architecture | JacquesB | View Answer on Stackoverflow
Solution 15 - Architecture | Akavall | View Answer on Stackoverflow
Solution 16 - Architecture | Didier A. | View Answer on Stackoverflow
Solution 17 - Architecture | DVK | View Answer on Stackoverflow
Solution 18 - Architecture | RBJ | View Answer on Stackoverflow