When is a function too long?

FunctionRefactoringCoding Style

Function Problem Overview


35 lines, 55 lines, 100 lines, 300 lines? When you should start to break it apart? I'm asking because I have a function with 60 lines (including comments) and was thinking about breaking it apart.

long_function(){ ... }

into:

small_function_1(){...}
small_function_2(){...}
small_function_3(){...}

The functions are not going to be used outside of the long_function, making smaller functions means more function calls, etc.

When would you break apart a function into smaller ones? Why?

  1. Methods should do only one logical thing (think about functionality)
  2. You should be able to explain the method in a single sentence
  3. It should fit into the height of your display
  4. Avoid unnecessary overhead (comments that point out the obvious...)
  5. Unit testing is easier for small logical functions
  6. Check if part of the function can be reused by other classes or methods
  7. Avoid excessive inter-class coupling
  8. Avoid deeply nested control structures

Thanks everyone for the answers, edit the list and vote for the correct answer I'll choose that one ;)

I am refactoring now with those ideas in mind :)

Function Solutions


Solution 1 - Function

Here is a list of red-flags (in no particular order) that could indicate that a function is too long:

  1. Deeply nested control structures: e.g. for-loops 3 levels deep or even just 2 levels deep with nested if-statements that have complex conditions.

  2. Too many state-defining parameters: By state-defining parameter, I mean a function parameter that guarantees a particular execution path through the function. Get too many of these type of parameters and you have a combinatorial explosion of execution paths (this usually happens in tandem with #1).

  3. Logic that is duplicated in other methods: poor code re-use is a huge contributor to monolithic procedural code. A lot of such logic duplication can be very subtle, but once re-factored, the end result can be a far more elegant design.

  4. Excessive inter-class coupling: this lack of proper encapsulation results in functions being concerned with intimate characteristics of other classes, hence lengthening them.

  5. Unnecessary overhead: Comments that point out the obvious, deeply nested classes, superfluous getters and setters for private nested class variables, and unusually long function/variable names can all create syntactic noise within related functions that will ultimately increase their length.

  6. Your massive developer-grade display isn't big enough to display it: Actually, displays of today are big enough that a function that is anywhere close to its height is probably way too long. But, if it is larger, this is a smoking gun that something is wrong.

  7. You can't immediately determine the function's purpose: Furthermore, once you actually do determine its purpose, if you can't summarize this purpose in a single sentence or happen to have a tremendous headache, this should be a clue.

In conclusion, monolithic functions can have far-reaching consequences and are often a symptom of major design deficiencies. Whenever I encounter code that is an absolute joy to read, it's elegance is immediately apparent. And guess what: the functions are often very short in length.

Solution 2 - Function

There are no real hard and fast rules for it. Generally I like my methods to just "do one thing". So if it's grabbing data, then doing something with that data, then writing it to disk then I'd split out the grabbing and writing into separate methods so my "main" method just contains the "doing something".

That "doing something" could still be quite a few lines though, so I'm not sure a number of lines is the right metric to be using :)

Edit: This is a single line of code I mailed around work last week (to prove a point.. it's not something I make a habit of :)) - I certainly wouldn't want 50-60 of these bad boys in my method :D

return level4 != null ? GetResources().Where(r => (r.Level2 == (int)level2) && (r.Level3 == (int)level3) && (r.Level4 == (int)level4)).ToList() : level3 != null ? GetResources().Where(r => (r.Level2 == (int)level2) && (r.Level3 == (int)level3)).ToList() : level2 != null ? GetResources().Where(r => (r.Level2 == (int)level2)).ToList() : GetAllResourceList();

Solution 3 - Function

I think there is a huge caveat to the "do only one thing" mantra on this page. Sometimes doing one thing juggles lots of variables. Don't break up a long function into a bunch of smaller functions if the smaller functions end up having long parameter lists. Doing that just turns a single function into a set of highly coupled functions with no real individual value.

Solution 4 - Function

A function should do only one thing. If you are doing many small things in a function, make each small thing a function and call those functions from the long function.

What you really don't want to do is copy and paste every 10 lines of your long function into short functions (as your example suggests).

Solution 5 - Function

I agree a function should only do one thing, but at what level is that one thing.

If your 60 lines is accomplishing one thing (from your programs perspective) and the pieces that make up those 60 lines aren't going to be used by anything else then 60 lines is fine.

There is no real benefit to breaking it up, unless you can break it up into concrete pieces that stand on their own. The metric to use is functionality and not lines of code.

I have worked on many programs where the authors took the only one thing to an extreme level and all it ended up doing was to make it look like someone took a grenade to a function/method and blew it up into dozens of unconnected pieces that are hard to follow.

When pulling out pieces of that function you also need to consider if you will be adding any unnecessary overhead and avoid passing large amounts of data.

I believe the key point is to look for reuseability in that long function and pull those parts out. What you are left with is the function, whether it is 10, 20, or 60 lines long.

Solution 6 - Function

60 lines is large but not too long for a function. If it fits on one screen in an editor you can see it all at once. It really depends on what the functions is doing.

Why I may break up a function:

  • It is too long
  • It makes the code more maintainable by breaking it up and using meaningful names for the new function
  • The function is not cohesive
  • Parts of the function are useful in themselves.
  • When it is difficult to come up with a meaningful name for the function (It is probably doing too much)

Solution 7 - Function

My personal heuristic is that it's too long if I can't see the whole thing without scrolling.

Solution 8 - Function

Size approx you screen size (so go get a big pivot widescreen and turn it)... :-)

Joke aside, one logical thing per function.

And the positive thing is that unit testing is really much easier to do with small logical functions that do 1 thing. Big functions that do many things are harder to verify!

/Johan

Solution 9 - Function

Rule of thumb: If a function contains code blocks that do something, that is somewhat detached from the rest of code, put it in a seperate function. Example:

function build_address_list_for_zip($zip) {

    $query = "SELECT * FROM ADDRESS WHERE zip = $zip";
    $results = perform_query($query);
    $addresses = array();
    while ($address = fetch_query_result($results)) {
        $addresses[] = $address;
    }

    // now create a nice looking list of
    // addresses for the user
    return $html_content;
}

much nicer:

function fetch_addresses_for_zip($zip) {
    $query = "SELECT * FROM ADDRESS WHERE zip = $zip";
    $results = perform_query($query);
    $addresses = array();
    while ($address = fetch_query_result($results)) {
        $addresses[] = $address;
    }
    return $addresses;
}

function build_address_list_for_zip($zip) {

    $addresses = fetch_addresses_for_zip($zip);

    // now create a nice looking list of
    // addresses for the user
    return $html_content;
}

This approach has two advantages:

  1. Whenever you need to fetch addresses for a certain zip code you can use the readily available function.

  2. When you ever need to read the function build_address_list_for_zip() again you know what the first code block is going to do (it fetches addresses for a certain zip code, at least thats what you can derive from the function name). If you would have left the query code inline, you would first need to analyze that code.

[On the other hand (I will deny I told you this, even under torture): If you read a lot about PHP optimization, you could get the idea to keep the number of functions as small a possible, because function call are very, very expensive in PHP. I don't know about that since I never did any benchmarks. If thats case you would probably be better of not following any of the answers to your question if you application is very "performance sensitive" ;-) ]

Solution 10 - Function

Take a peek at McCabe's cyclomatic, in which he breaks up his code into a graph where, "Each node in the graph corresponds to a block of code in the program where the flow is sequential and the arcs correspond to branches taken in the program."

Now imagine your code has no functions/methods; its just one huge sprawl of code in the form of a graph.

You want to break this sprawl into methods. Consider that, when you do, there will be a certain number of blocks in each method. Only one block of each method will be visible to all other methods: the first block (we're presuming that you will be able to jump into a method at only one point: the first block). All the other blocks in each method will be information hidden within that method, but each block within a method may potentially jump to any other block within that method.

To determine what size your methods should be in terms of number of blocks per method, one question you might ask yourself is: how many methods should I have to minimise the maximum potential number of dependencies (MPE) between all blocks?

That answer is given by an equation. If r is the number of methods that minimises the MPE of the system, and n is the number of blocks in the system, then the equation is: r = sqrt(n)

And it can be shown that this gives the number of blocks per method to be, also, sqrt(n).

Solution 11 - Function

Bear in mind that you can end up re-factoring just for re-factoring's sake, potentially making the code more unreadable than it was in the first place.

A former colleague of mine had a bizarre rule that a function/method must only contain 4 lines of code! He tried to stick to this so rigidly that his method names often became repetitive & meaningless plus the calls became deeply nested & confusing.

So my own mantra has become: if you can't think of a decent function/method name for the chunk of code you are re-factoring, don't bother.

Solution 12 - Function

The main reason I usually break a function up is either because bits and pieces of it are also ingredients in another nearby function I'm writing, so the common parts get factored out. Also, if it's using a lot of fields or properties out of some other class, there's a good chance that the relevant chunk can be lifted out wholesale and if possible moved into the other class.

If you have a block of code with a comment at the top, consider pulling it out into a function, with the function and argument names illustrating its purpose, and reserving the comment for the code's rationale.

Are you sure there are no pieces in there that would be useful elsewhere? What sort of function is it?

Solution 13 - Function

I usually break functions up by the need to place comments describing the next code block. What previously went into the comments now goes into the new function name. This is no hard rule, but (for me) a nice rule of thumb. I like code speaking for itself better than one that needs comments (as I've learned that comments usually lie)

Solution 14 - Function

In my opinion the answer is: when it does too much things. Your function should perform only the actions you expect from the name of the function itself. Another thing to consider is if you want to reuse some parts of your functions in others; in this case it could be useful to split it.

Solution 15 - Function

This is partly a matter of taste, but how I determine this is I try to keep my functions roughly only as long as will fit on my screen at one time (at a maximum). The reason being that it's easier to understand what's happening if you can see the whole thing at once.

When I code, it's a mix of writing long functions, then refactoring to pull out bits that could be reused by other functions -and- writing small functions that do discrete tasks as I go.

I don't know that there is any right or wrong answer to this (e.g., you may settle on 67 lines as your max, but there may be times when it makes sense to add a few more).

Solution 16 - Function

There has been some thorough research done on this very topic, if you want the fewest bugs, your code shouldn't be too long. But it also shouldn't be too short.

I disagree that a method should fit on your display in one, but if you are scrolling down by more than a page then the method is too long.

See The Optimal Class Size for Object-Oriented Software for further discussion.

Solution 17 - Function

I have written 500 line functions before, however these were just big switch statements for decoding and responding to messages. When the code for a single message got more complex than a single if-then-else, I extracted it out.

In essence, although the function was 500 lines, the independently maintained regions averaged 5 lines.

Solution 18 - Function

I normally uses a test driven approach to writing code. In this approach the function size is often related to the granularity of your tests.

If your test is sufficiently focused then it will lead you to write a small focused function to make the test pass.

This also works in the other direction. Functions need to be small enough to test effectively. So when working with legacy code I often find that I break down larger functions in-order to test the different parts of them.

I usually ask myself "what is the responsibility of this function" and if I can't state the responsibility in a clear concise sentence, and then translate that into a small focused test, I wonder if the function is too big.

Solution 19 - Function

If it has more than three branches, generally this means that a function or method should be broken apart, to encapsulate the branching logic in different methods.

Each for loop, if statement, etc is then not seen as a branch in the calling method.

Cobertura for Java code (and I'm sure there are other tools for other languages) calculates the number of if, etc. in a function for each function and sums it for the "average cyclomatic complexity".

If a function/method only has three branches, it will get a three on that metric, which is very good.

Sometimes it is difficult to follow this guideline, namely for validating user input. Nevertheless putting branches in different methods aids not only development and maintenance but testing as well, since the inputs to the methods that perform the branching can be analyzed easily to see what inputs need to be added to the test cases in order to cover the branches that were not covered.

If all the branches were inside a single method, the inputs would have to be tracked since the start of the method, which hinders testability.

Solution 20 - Function

I suspect you'll find a lot of answers on this.

I would probably break it up based on the logical tasks that were being performed within the function. If it looks to you that your short story is turning into a novel, I would suggest find and extract distinct steps.

For example, if you have a function that handles some kind of string input and returns a string result, you might break the function up based on the logic to split your string into parts, the logic to add extra characters and the logic to put it all back together again as a formatted result.

In short, whatever makes your code clean and easy to read (whether that's by simply ensuring your function has good commenting or breaking it up) is the best approach.

Solution 21 - Function

assuming that you are doing one thing, the length will depend on:

  • what you are doing
  • what language you are using
  • how many levels of abstraction you need to deal with in the code

60 lines might be too long or it might be just right. i suspect that it may be too long though.

Solution 22 - Function

One thing (and that thing should be obvious from the function name), but no more than a screenful of code, regardless. And feel free to increase your font size. And if in doubt, refactor it into two or more functions.

Solution 23 - Function

Extending the spirit of a tweet from Uncle Bob a while back, you know a function is getting too long when you feel the need to put a blank line between two lines of code. The idea being that if you need a blank line to separate the code, its responsibility and scope are separating at that point.

Solution 24 - Function

My idea is that if I have to ask myself if it is too long, it is probably too long. It helps making smaller functions, in this area, because it could help later in the application's life cycle.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser58163View Question on Stackoverflow
Solution 1 - FunctionRyan DelucchiView Answer on Stackoverflow
Solution 2 - FunctionSteven RobbinsView Answer on Stackoverflow
Solution 3 - FunctionjmucchielloView Answer on Stackoverflow
Solution 4 - FunctionPaige RutenView Answer on Stackoverflow
Solution 5 - FunctionbruceatkView Answer on Stackoverflow
Solution 6 - FunctiondajoradView Answer on Stackoverflow
Solution 7 - FunctionFerruccioView Answer on Stackoverflow
Solution 8 - FunctionJohanView Answer on Stackoverflow
Solution 9 - FunctionEricSchaeferView Answer on Stackoverflow
Solution 10 - FunctionEd KirwanView Answer on Stackoverflow
Solution 11 - FunctionSaltmeisterView Answer on Stackoverflow
Solution 12 - FunctionJeffrey HantinView Answer on Stackoverflow
Solution 13 - FunctionOlaf KockView Answer on Stackoverflow
Solution 14 - FunctionPierpaoloView Answer on Stackoverflow
Solution 15 - FunctionAndrew HedgesView Answer on Stackoverflow
Solution 16 - FunctionIan HickmanView Answer on Stackoverflow
Solution 17 - FunctionJoshuaView Answer on Stackoverflow
Solution 18 - FunctionlexicalscopeView Answer on Stackoverflow
Solution 19 - FunctionMetroidFan2002View Answer on Stackoverflow
Solution 20 - FunctionPhil.WheelerView Answer on Stackoverflow
Solution 21 - FunctionRay TayekView Answer on Stackoverflow
Solution 22 - FunctiongkrogersView Answer on Stackoverflow
Solution 23 - FunctionMark BostlemanView Answer on Stackoverflow
Solution 24 - FunctionsokketView Answer on Stackoverflow