Is violation of DRY principle always bad?


Design Patterns Problem Overview


I have been discussing the DRY (Don't Repeat Yourself) principle, also known as DIE (Duplication Is Evil), and some people argue that any simple code repetition is always evil. I would like to hear your opinion on the following points:

  1. Uncertain future. Let's say we have the same code in two places. The key point is that these two places are only incidentally similar. They may well diverge in the future because their context and semantics are different. Building an abstraction from these places is not cheap, and if one of them changes, unwinding the abstraction will be even more expensive.
  2. Readability. There is a complex computation that involves several variables or steps. Elsewhere in the code there is another one that has some identical parts. The problem is that if we extract the common parts, the readability of the calculation will decrease, and it will be very hard to give the resulting abstraction a descriptive name. It gets worse if some part of the algorithm changes in the future, as in point 1.

Are the above cases good reasons to give up on abstraction and just leave the duplicated code in place, whether because of the risk of future changes or simply for readability?

Design Patterns Solutions


Solution 1 - Design Patterns

Those are entirely valid reasons to violate DRY. I should add a third: performance. It's rarely a big deal, but it can make a difference, and abstraction can risk slowing things down.

Actually, I'll add a fourth: wasting time and potentially introducing new bugs by changing two (or more) parts of a codebase that might already be working just fine. Is it worth the cost of figuring out how to abstract these things if you don't have to and it probably won't save any or much time in the future?

Typically, duplicated code is not ideal, but there are certainly compelling reasons to allow it, probably including reasons beyond those the OP and I have suggested.

Solution 2 - Design Patterns

Yes, certain code duplications are notoriously difficult to factor out without making the readability significantly worse. In such situations I leave a TODO in comments as a reminder that there is some duplication but at the time of writing it seemed better to leave it like that.

Usually what happens is what you describe in your first point: the duplications diverge and are no longer duplications. It also happens that the duplication is a sign of a design issue, but that only becomes clear later.

Long story short: try to avoid duplication; if the duplication is notoriously difficult to factor out and at the time of writing harmless, just leave a comment as a reminder.


See also 97 Things Every Programmer Should Know:

p. 14. Beware the Share by Udi Dahan

> The fact that two wildly different parts of the system performed some logic in the same way meant less than I thought. Up until I had pulled out those libraries of shared code, these parts were not dependent on each other. Each could evolve independently. Each could change its logic to suit the needs of the system’s changing business environment. Those four lines of similar code were accidental—a temporal anomaly, a coincidence.

In that case, he created a dependence between two parts of the system that were better kept independent. The solution was essentially duplication.

Solution 3 - Design Patterns

Let's try to understand why DRY is important, and then we can understand where breaking the rule is reasonable:

DRY should be used to avoid the situation where two pieces of code are conceptually doing some of the same work, so that whenever you change the code in one place you have to change the code in the other place. If the same logic is in two separate places, then you always have to remember to change the logic in both places, which can be quite error prone. This can apply at any scale. It can be an entire application that is being duplicated or it can be a single constant value. There also may not be any repeated code at all; it may just be a repeated principle. You have to ask "If I were to make a change in one place, would I necessarily need to make an equivalent change somewhere else?". If the answer is "yes", then the code is violating DRY.

Imagine that you have a line like this in your program:

cost = price + price*0.10; // account for sales tax

and somewhere else in your program, you have a similar line:

x = base_price*1.1; // account for sales tax

If the sales tax changes, you are going to need to change both of those lines. There is almost no repeated code here, but the fact that if you make a change in one place it requires a change in another place is what makes the code not DRY. What's more, it may be very difficult to realize that you have to make the change in two places. Maybe your unit tests will catch it, but maybe not, so getting rid of the duplication is important. Maybe you would factor the value of the sales tax into a separate constant that can be used in multiple places:

cost = price + price*sales_tax;
x = base_price*(1.0+sales_tax);

or maybe create a function to abstract it even more:

cost = costWithTax(price);
x = costWithTax(base_price);

Either way, it is very likely to be worth the trouble.
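
To make the two refactorings above concrete, here is a minimal runnable sketch in Python. The names `SALES_TAX_RATE` and `cost_with_tax` are my own stand-ins for `sales_tax` and `costWithTax`, and the prices are made up; only the 10% rate comes from the example.

```python
# Single source of truth for the tax rate (10%, as in the example above).
# SALES_TAX_RATE and cost_with_tax are illustrative names, not from the original.
SALES_TAX_RATE = 0.10


def cost_with_tax(price: float) -> float:
    """Return the price with sales tax added."""
    return price * (1.0 + SALES_TAX_RATE)


# Both call sites now depend on one definition of the tax rule.
cost = cost_with_tax(100.0)
x = cost_with_tax(250.0)
```

If the tax rate changes, only `SALES_TAX_RATE` needs to be edited, and both call sites pick up the new value.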

Alternatively, you may have code that looks very similar but isn't violating DRY:

x = base_price * 1.1; // add 10% markup for premium service

If you were to change the way sales tax is calculated, you wouldn't want to change that line of code, so it isn't actually repeating any logic.
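
One way to keep such look-alike lines apart is to give each rule its own name, even though the numbers coincide today. This is only a sketch in Python; `SALES_TAX_RATE`, `PREMIUM_MARKUP_RATE`, and both function names are hypothetical.

```python
# Two rules that happen to share a value today. The names are assumptions
# made for this sketch; they do not come from the original example.
SALES_TAX_RATE = 0.10       # set by tax law
PREMIUM_MARKUP_RATE = 0.10  # set by pricing policy; equal only by coincidence


def add_sales_tax(price: float) -> float:
    """Apply the sales tax rule."""
    return price * (1.0 + SALES_TAX_RATE)


def add_premium_markup(base_price: float) -> float:
    """Apply the premium service markup rule."""
    return base_price * (1.0 + PREMIUM_MARKUP_RATE)


taxed = add_sales_tax(100.0)
marked_up = add_premium_markup(100.0)
```

The two functions have the same body, but merging them would tie the markup to the tax rate, which is exactly the dependency you don't want.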

There are also cases where having to make the same change in multiple places is okay. For example, maybe you have code like this:

a0 = f(0);
a1 = f(1);

This code isn't DRY in a few ways. For example, if you were to change the name of function f, you would have to change two places. You could perhaps make the code more DRY by creating a small loop and turning a into an array. However, this particular duplication isn't a big deal. First, the two changes are very close together, so accidentally changing one without changing the other is unlikely. Second, if you are in a compiled language, then the compiler will most likely catch the problem anyway. If you are not in a compiled language, then hopefully your unit tests will catch it.
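
For comparison, here is roughly what the loop version mentioned above could look like in Python. The body of `f` is a made-up placeholder, since the original does not say what `f` does.

```python
def f(n: int) -> int:
    """Hypothetical stand-in for the f in the example above."""
    return n * n + 1


# Duplicated form: f is named twice.
a0 = f(0)
a1 = f(1)

# Loop form: f is named once, and a becomes a list.
a = [f(i) for i in range(2)]
```

For two elements the gain is marginal, which is the point: the duplicated form is arguably just as readable.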

There are many good reasons to make your code DRY, but there are many good reasons not to as well.

Solution 4 - Design Patterns

Engineering is all about trade-offs, so there's no definitive advice or design pattern that is valid for every problem. Some decisions are harder to justify than others (code repetition is one of them), but if the pros of repeating code outweigh the cons in your situation, go for it.

Solution 5 - Design Patterns

There are no absolutes; it is always going to be a judgement call about which is the lesser of two evils. Usually DRY wins, and you have to be careful of slippery slopes when you start to violate it, but your reasoning seems fine to me.

Solution 6 - Design Patterns

For an excellent response to this question, please refer to "The Pragmatic Programmer" by Andy Hunt and Dave Thomas. (It was Dave Thomas who came up with the term "DRY" in the first place.)

In short, there is no easy answer. It's almost always better to remain DRY, but if repeating yourself improves readability then you should use your best judgement. It's your call!

Solution 7 - Design Patterns

No, violation of DRY isn't always bad. Especially, if you fail to come up with a good name for an abstraction of the duplicated code, i.e. a name that suits both contexts, it might be that they're different things after all, and should be left duplicated.

In my experience this kind of coincidence tends to be rare though, and the larger the duplicated code, the more likely it is to describe one single concept.

I also find that abstracting to composition is almost always a better idea in that regard than abstracting to inheritance, which can easily lead you to false equivalences and to violations of the Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP).
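
As a rough illustration of what abstracting to composition can look like, here is a small Python sketch. Every name in it (`TaxCalculator`, `Invoice`, `Quote`) is invented for the example; the idea is just that the shared behaviour lives in a small object both classes hold, instead of in a common base class.

```python
class TaxCalculator:
    """Shared behaviour extracted into its own small object (hypothetical)."""

    def __init__(self, rate: float) -> None:
        self.rate = rate

    def apply(self, amount: float) -> float:
        return amount * (1.0 + self.rate)


class Invoice:
    """Has a TaxCalculator rather than inheriting from a shared base class."""

    def __init__(self, net: float, tax: TaxCalculator) -> None:
        self.net = net
        self._tax = tax

    def total(self) -> float:
        return self._tax.apply(self.net)


class Quote:
    """Reuses the same calculator without claiming to be a kind of Invoice."""

    def __init__(self, estimate: float, tax: TaxCalculator) -> None:
        self.estimate = estimate
        self._tax = tax

    def total(self) -> float:
        return self._tax.apply(self.estimate)


shared_tax = TaxCalculator(0.10)
invoice_total = Invoice(100.0, shared_tax).total()
quote_total = Quote(200.0, shared_tax).total()
```

Because `Invoice` and `Quote` are unrelated types, neither has to honour the other's contract, and either one can switch to a different calculator later without touching the other.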

Solution 8 - Design Patterns

I believe yes. Although DRY is ideal as a general rule, there are times when it is better to simply repeat yourself. I often find myself disregarding DRY in the pre-development testing phase. You never know when you will need to make slight changes to one function that you do not want to make in another. Of course, I always try to observe DRY in "finished" applications (ones that are completed and will NOT ever need to be modified), but those are few and far between. In the end it depends on the application's future needs. I've written applications I wished were DRY, and I've thanked God I didn't observe it on others.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Ryszard Dżegan | View Question on Stackoverflow |
| Solution 1 - Design Patterns | pattivacek | View Answer on Stackoverflow |
| Solution 2 - Design Patterns | Ali | View Answer on Stackoverflow |
| Solution 3 - Design Patterns | Vaughn Cato | View Answer on Stackoverflow |
| Solution 4 - Design Patterns | Daniel Martín | View Answer on Stackoverflow |
| Solution 5 - Design Patterns | jlew | View Answer on Stackoverflow |
| Solution 6 - Design Patterns | ilan berci | View Answer on Stackoverflow |
| Solution 7 - Design Patterns | guillaume31 | View Answer on Stackoverflow |
| Solution 8 - Design Patterns | Robert Dickey | View Answer on Stackoverflow |