What is refactoring and what is only modifying code?

Refactoring

Refactoring Problem Overview


I know that refactoring is "changing the structure of a program so that the functionality is not changed". I was talking with some of the guys I'm working with on my final year project at university and I was surprised that they have a much more expansive (for want of a better word) view of refactoring.

I consider refactoring to be things like extracting methods and renaming classes. They also suggested things like changing data structures (like a Java LinkedList to an ArrayList), changing algorithms (using merge sort instead of bubble sort), and even rewriting large chunks of code as refactoring.

I was quite sure that they were wrong, but I wasn't able to give a good reason why because what they were suggesting did change the program (and presumably make it better) without changing its behaviour. Am I right, and more importantly, why?

Refactoring Solutions


Solution 1 - Refactoring

Martin Fowler's "Refactoring: Improving the Design of Existing Code" is perhaps THE reference:

> Refactoring is a controlled technique > for improving the design of an > existing code base. Its essence is > applying a series of small > behavior-preserving transformations, > each of which "too small to be worth > doing". However the cumulative effect > of each of these transformations is > quite significant. By doing them in > small steps you reduce the risk of > introducing errors. You also avoid > having the system broken while you are > carrying out the restructuring - which > allows you to gradually refactor a > system over an extended period of > time.

Refactoring goes hand-in-hand with unit testing. Write tests before you refactor and then you have a confidence level in the refactoring (proportional to the coverage of the tests).

A good reference is: Information about Refactoring

Solution 2 - Refactoring

Fowler draws a clean line between changes to code that do, and those that do not, affect its behavior. He calls those that do not, "refactoring". This is an important distinction, because if we divide our work into refactoring and non-refactoring code modification activities (Fowler calls it "wearing different hats"), we can apply different, goal-appropriate techniques.

If we are making a refactoring, or behavior-preserving code modification:

  • all our unit tests should pass before and after the modification
  • we should not need to modify any tests, or write any new ones
  • we expect cleaner code when we are done
  • we do not expect new behavior

If we are making a behavior-changing code modification:

  • we expect new behavior
  • we should write new tests
  • we may get dirtier code when we are done (and should then refactor it)

If we lose sight of this distinction, then our expectations for any given code modification task are muddled and complex, or at any rate more muddled and more complex than if we are mindful of it. That is why the word and its meaning are important.

Solution 3 - Refactoring

To give my view:

Small, incremental changes that leave the code in a better state than it was found

Definitely Yes: "Cosmetic" changes that are not directly related to features (i.e. it's not billable as a change request).

Definitely No: Rewriting large chunks clearly violates the "small, incremental" part. Refactoring is often used as the opposite of a rewrite: instead of doing it again, improve the existing.

Definitely Maybe: Replacing data structures and algorithms is somewhat of a border case. The deciding difference here IMO is the small steps: be ready to to deliver, be ready to work on another case.


Example: Imagine you have a Report Randomizer Module that is slowed down by it's use of a vector. You've profiled that vector insertions are the bottleneck, but unfortunately the module relies on contigous memory in many places so that when using a list, things would break silently.

Rewriting would mean throwing the Module away an building a better and faster one from scratch, just picking some pieces from the old one. Or writing a new core, then fitting it into the existing dialog.

Refactoring would mean to take small steps to remove the pointer arithmetics, so that the switch. Maybe you even create a utility function wrapping the pointer arithmetics, replacing direct pointer manipulation with calls to that function, then switch to an iterator so that the compiler complains about places where pointer arithmetics is still used, then switch to a list, and then remove the ultility function.


The idea behind is that code gets worse on its own. When fixing bugs and adding features, quality decays in small steps - the meaning of a variable subtly changes, a functions gets an additional parameter that breaks isolation, a loop gets a bit to complex etc. None of these is a real bug, you can't tell a line count that makes the loop to complex, but you hurt readability and maintenance.

Similarly, changing a variable name or extracting a function, aren't tangible improvements of their own. But alltogether, they fight the slow erosion.

Like a wall of pebbles where everyday one falls to the ground. And everyday, one passerby picks it up and puts it back.

Solution 4 - Refactoring

With Martin Fowler's definition in mind,

> Refactoring is a disciplined technique > for restructuring an existing body of > code, altering its internal structure > without changing its external behavior.

... I think you are clearly right.

> They also suggested things like > changing data structures (like a Java > LinkedList to an ArrayList), changing > algorithms (using merge sort instead > of bubble sort), and even rewriting > large chunks of code as refactoring.

Changing an algorithm to something much faster is obviously not refactoring, because external behaviour is changed! (Then again, if the effect is never noticeable, perhaps you could call it refactoring after all - and also premature optimisation. :-)

This is a pet peeve of mine; it's annoying when people use the term sloppily - I've even come across some who might casually use refactoring for basically any kind of change or fix. Yeah, it's a hip and cool buzzword and all, but there's nothing wrong with plain old terms like change, rewrite or performance improvement. We should use those when appropriate, and reserve refactoring for cases when you are truly just improving the internal structure of your software. Within a development team, especially, having a common language for accurately discussing your work does matter.

Solution 5 - Refactoring

If the interface to a piece of code changes, then I consider that more than refactoring.

The typical case for refactoring is

  • "Oh, all my unit tests run, but I think my code could be made cleaner"
  • Change code to be more readable / clearer / efficient
  • Re-run unit tests (without changing the tests) and check they still work

This means that the term refactoring is relative to the interface you are discussing. i.e you could be refactoring the code behind one interface, while more extensively changing the code of another at a lower level (maybe that distinction is what's providing the confusion between you and your colleagues here?)

Solution 6 - Refactoring

I think you are right, but arguing over the meaning of a word is not particularly interesting or productive.

Solution 7 - Refactoring

http://en.wikipedia.org/wiki/Code_refactoring

> Code refactoring is > the process of changing a computer > program's internal structure without > modifying its external functional > behavior or existing functionality, in > order to improve internal > non-functional properties of the > software, for example to improve code > readability, to simplify code > structure, to change code to adhere to > a given programming paradigm, to > improve maintainability, to improve > performance, or to improve > extensibility.

I agree that refactoring code does include breaking existing code. Just make sure that you have unit tests so that you do not introduce any bugs, and the rest of the code compiles. Using refactoring tools like Resharper for C# makes this so easy!

  • Making code more understandable
  • Cleaning up code and making it tidier
  • Removing code! Redundant, unused code and comments should be deleted
  • Improving performance
  • Making something more generic. Start with the simplest thing possible, then refactor it out to make it easier to test / isolate or generic so it can work in different ways through polymorphism
  • Keeping code DRY - Don't repeat yourself, so a refactoring session might involve taking some repeated code and refactoring it into a single component / class / module.

Solution 8 - Refactoring

I disagree:

> In software engineering, "refactoring" > source code means improving it without > changing its overall results [...]

You already know the more precise terms used for subsets of refactoring, and yes, it's a very general term.

Solution 9 - Refactoring

I think that nobody can benefit of a too strong definition of the term 'refactoring'. The border between how you do perceive it and your colleagues is blurry and can be closer to their or your view depending on many facts. Since it's dynamic let's try to define it. First of all define the boundaries of the system or subsystem that you are trying to refactor.

If it is a method keep the name, input arguments, type of the returned value and possibly throwing statements fixed. Apply all the changes inside the method without changing how it is viewed outside.

If you refactor a class fix its public API and using rename variables, extract methods and all the other available techniques change the class to be more readable and/or more performant.

If the part of the code that you are refactoring is a package or a module do the refactoring inside of it possibly renaming classes, deleting, introducing interfaces, pushing/pulling code into super/subclasses.

Solution 10 - Refactoring

Refactoring = improving non functional requirements while keeping unaltered the functional ones.

Non functional requirements = modularity, testability, maintainability ,readability, separation of concerns, liskov principles and so on...

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDavid JohnstoneView Question on Stackoverflow
Solution 1 - RefactoringMitch WheatView Answer on Stackoverflow
Solution 2 - RefactoringCarl ManasterView Answer on Stackoverflow
Solution 3 - RefactoringpeterchenView Answer on Stackoverflow
Solution 4 - RefactoringJonikView Answer on Stackoverflow
Solution 5 - RefactoringDanSingermanView Answer on Stackoverflow
Solution 6 - RefactoringcbpView Answer on Stackoverflow
Solution 7 - RefactoringsuperlogicalView Answer on Stackoverflow
Solution 8 - Refactoringl0b0View Answer on Stackoverflow
Solution 9 - RefactoringBoris PavlovićView Answer on Stackoverflow
Solution 10 - RefactoringEnzoViciView Answer on Stackoverflow