When NOT to use yield (return)

C#, .NET, Yield, Yield Return

C# Problem Overview


> This question already has an answer here:
> Is there ever a reason to not use 'yield return' when returning an IEnumerable?

There are several useful questions here on SO about the benefits of yield return.

I'm looking for thoughts on when NOT to use yield return. For example, if I expect to need to return all items in a collection, it doesn't seem like yield would be useful, right?

What are the cases where use of yield will be limiting, unnecessary, get me into trouble, or otherwise should be avoided?

C# Solutions


Solution 1 - C#

> What are the cases where use of yield will be limiting, unnecessary, get me into trouble, or otherwise should be avoided?

It's a good idea to think carefully about your use of "yield return" when dealing with recursively defined structures. For example, I often see this:

public static IEnumerable<T> PreorderTraversal<T>(Tree<T> root)
{
    if (root == null) yield break;
    yield return root.Value;
    foreach(T item in PreorderTraversal(root.Left))
        yield return item;
    foreach(T item in PreorderTraversal(root.Right))
        yield return item;
}

Perfectly sensible-looking code, but it has performance problems. Suppose the tree is h deep. Then at most points during the traversal there will be O(h) nested iterators built. Calling "MoveNext" on the outer iterator will then make O(h) nested calls to MoveNext. Since it does this O(n) times for a tree with n items, that makes the algorithm O(hn). And since the height of a binary tree is lg n <= h <= n, that means that the algorithm is at best O(n lg n) and at worst O(n^2) in time, and best case O(lg n) and worst case O(n) in stack space. It is O(h) in heap space because each enumerator is allocated on the heap. (This holds on the implementations of C# I'm aware of; a conforming implementation might have other stack or heap space characteristics.)

But iterating a tree can be O(n) in time and O(1) in stack space. You can write this instead like:

public static IEnumerable<T> PreorderTraversal<T>(Tree<T> root)
{
    var stack = new Stack<Tree<T>>();
    stack.Push(root);
    while (stack.Count != 0)
    {
        var current = stack.Pop();
        if (current == null) continue;
        yield return current.Value;
        // Push right first so that the left subtree is popped (and visited) first.
        stack.Push(current.Right);
        stack.Push(current.Left);
    }
}

which still uses yield return, but is much smarter about it. Now we are O(n) in time and O(h) in heap space, and O(1) in stack space.

Further reading: see Wes Dyer's article on the subject:

http://blogs.msdn.com/b/wesdyer/archive/2007/03/23/all-about-iterators.aspx

Solution 2 - C#

> What are the cases where use of yield will be limiting, unnecessary, get me into trouble, or otherwise should be avoided?

I can think of a couple of cases, e.g.:

  • Avoid using yield return when you return an existing iterator. Example:

     // Don't do this, it creates overhead for no reason
     // (a new state machine needs to be generated)
     public IEnumerable<string> GetKeys() 
     {
         foreach(string key in _someDictionary.Keys)
             yield return key;
     }
     // DO this
     public IEnumerable<string> GetKeys() 
     {
         return _someDictionary.Keys;
     }
    
  • Avoid using yield return when you don't want to defer execution of the method's code. Example:

     // Don't do this, the exception won't get thrown until the iterator is
     // iterated, which can be very far away from this method invocation
     public IEnumerable<string> Foo(Bar baz) 
     {
         if (baz == null)
             throw new ArgumentNullException();
         yield ...
     }
     // DO this
     public IEnumerable<string> Foo(Bar baz) 
     {
         if (baz == null)
             throw new ArgumentNullException();
         return new BazIterator(baz);
     }
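
If you want both the eager argument check and a lazily produced sequence, one common approach is to split the method in two: a plain wrapper that validates, and a separate iterator that does the yielding. The following is only a sketch; `Words` and `WordsIterator` are hypothetical names, and it assumes C# 9 or later so that top-level statements and local functions can be used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: it contains no yield, so it is NOT an iterator
// and the null check runs as soon as the method is called.
IEnumerable<string> Words(string text)
{
    if (text == null)
        throw new ArgumentNullException(nameof(text));
    return WordsIterator(text);
}

// The iterator part: its body only runs once the sequence is enumerated.
IEnumerable<string> WordsIterator(string text)
{
    foreach (string word in text.Split(' '))
        yield return word;
}

bool threwAtCallSite = false;
try
{
    Words(null); // never enumerated, yet the check still fires here
}
catch (ArgumentNullException)
{
    threwAtCallSite = true;
}

Console.WriteLine(threwAtCallSite);                  // True
Console.WriteLine(string.Join(",", Words("a b c"))); // a,b,c
```

The caller still gets deferred execution for the actual work; only the validation has moved to the call site.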
    

Solution 3 - C#

The key is to understand what yield is useful for; then you can decide which cases do not benefit from it.

In other words, when you do not need a sequence to be lazily evaluated you can skip the use of yield. When would that be? It would be when you do not mind immediately having your entire collection in memory. Otherwise, if you have a huge sequence that would negatively impact memory, you would want to use yield to work on it step by step (i.e., lazily). A profiler might come in handy when comparing both approaches.

Notice how most LINQ statements return an IEnumerable<T>. This allows us to continually string different LINQ operations together without negatively impacting performance at each step (aka deferred execution). The alternative picture would be putting a ToList() call in between each LINQ statement. This would cause each preceding LINQ statement to be executed immediately before the next (chained) LINQ statement runs, thereby forgoing the benefit of lazy evaluation, which is that the IEnumerable<T> is not enumerated until it is needed.
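
To make the difference concrete, here is a small sketch that counts how many source elements are actually produced in each style. `Numbers` and the counter are hypothetical, introduced only for this illustration, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int evaluated = 0;

// Hypothetical source that counts how many elements it actually produces.
IEnumerable<int> Numbers()
{
    for (int i = 1; i <= 1000; i++)
    {
        evaluated++;
        yield return i;
    }
}

// Chained lazily: the source only produces as many elements as Take(3) needs.
List<int> lazy = Numbers().Where(n => n % 2 == 0).Select(n => n * 10).Take(3).ToList();
int evaluatedLazily = evaluated;

// With ToList() right after the source, all 1000 elements are produced up front.
evaluated = 0;
List<int> eager = Numbers().ToList().Where(n => n % 2 == 0).Select(n => n * 10).Take(3).ToList();
int evaluatedEagerly = evaluated;

Console.WriteLine($"lazy: {evaluatedLazily} elements produced, eager: {evaluatedEagerly}");
```

Both pipelines return the same three values; they differ only in how much of the source had to be walked to get them.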

Solution 4 - C#

There are a lot of excellent answers here. I would add this one: Don't use yield return for small or empty collections where you already know the values:

IEnumerable<UserRight> GetSuperUserRights() {
    if(SuperUsersAllowed) {
        yield return UserRight.Add;
        yield return UserRight.Edit;
        yield return UserRight.Remove;
    }
}

In these cases the creation of the Enumerator object is more expensive, and more verbose, than just generating a data structure.

IEnumerable<UserRight> GetSuperUserRights() {
    return SuperUsersAllowed
           ? new[] {UserRight.Add, UserRight.Edit, UserRight.Remove}
           : Enumerable.Empty<UserRight>();
}

Update

Here are the results of my benchmark:

[Figure: Benchmark Results]

These results show how long it took (in milliseconds) to perform the operation 1,000,000 times. Smaller numbers are better.

In revisiting this, the performance difference isn't significant enough to worry about, so you should go with whatever is the easiest to read and maintain.

Update 2

I'm pretty sure the above results were achieved with compiler optimization disabled. Running in Release mode with a modern compiler, it appears performance is practically indistinguishable between the two. Go with whatever is most readable to you.

Solution 5 - C#

Eric Lippert raises a good point (too bad C# doesn't have stream flattening like Cω). I would add that sometimes the enumeration process is expensive for other reasons, and therefore you should use a list if you intend to iterate over the IEnumerable more than once.

For example, LINQ-to-objects is built on "yield return". If you've written a slow LINQ query (e.g. that filters a large list into a small list, or that does sorting and grouping), it may be wise to call ToList() on the result of the query in order to avoid enumerating multiple times (which actually executes the query multiple times).
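
A quick way to see the repeated execution is to count predicate calls. This is a sketch with made-up data and variable names, assuming C# 9 or later for top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int predicateCalls = 0;
int[] source = { 5, 3, 8, 1, 9, 2 };

// A deferred query: nothing runs until it is enumerated.
IEnumerable<int> query = source.Where(n => { predicateCalls++; return n > 4; });

// Enumerating twice runs the filter over the whole source twice.
int count = query.Count();
int max = query.Max();
int callsWithoutCaching = predicateCalls; // 12: six elements, two passes

// Caching with ToList() runs the filter once; later passes read the list.
predicateCalls = 0;
List<int> cached = query.ToList();
int cachedCount = cached.Count;
int cachedMax = cached.Max();
int callsWithCaching = predicateCalls; // 6: one pass

Console.WriteLine($"{callsWithoutCaching} vs {callsWithCaching}"); // prints "12 vs 6"
```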

If you are choosing between "yield return" and List<T> when writing a method, consider: is each single element expensive to compute, and will the caller need to enumerate the results more than once? If you know the answers are yes and yes, you shouldn't use yield return, unless, for example, the List produced is very large and you can't afford the memory it would use. Remember, another benefit of yield is that the result list doesn't have to be entirely in memory at once.

Another reason not to use "yield return" is if interleaving operations is dangerous. For example, if your method looks something like this,

IEnumerable<T> GetMyStuff() {
    foreach (var x in MyCollection)
        if (...)
            yield return (...);
}

this is dangerous if there is a chance that MyCollection will change because of something the caller does:

foreach(T x in GetMyStuff()) {
    if (...)
        MyCollection.Add(...);
        // Oops, now GetMyStuff() will throw an exception
        // because MyCollection was modified.
}

yield return can cause trouble whenever the caller changes something that the yielding function assumes does not change.
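
Here is a runnable version of that failure, as a sketch. `myCollection` and `GetEvensLazily` are stand-ins for the elided code above, a List<int> is assumed as the underlying collection, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myCollection = new List<int> { 1, 2, 3 };

// Streams lazily from the live list, in the same way as GetMyStuff() above.
IEnumerable<int> GetEvensLazily()
{
    foreach (int x in myCollection)
        if (x % 2 == 0)
            yield return x;
}

bool lazyThrew = false;
try
{
    foreach (int x in GetEvensLazily())
        myCollection.Add(x * 10); // modifies the list while it is still being enumerated
}
catch (InvalidOperationException)
{
    lazyThrew = true; // List<T>'s enumerator detects the change on its next MoveNext
}

// Taking a snapshot first (here with ToList()) decouples the two loops.
myCollection = new List<int> { 1, 2, 3 };
foreach (int x in GetEvensLazily().ToList())
    myCollection.Add(x * 10);

Console.WriteLine(lazyThrew);                      // True
Console.WriteLine(string.Join(",", myCollection)); // 1,2,3,20
```

Whether the snapshot belongs in the yielding method or at the call site is a design choice; the point is only that someone has to finish reading before anyone starts writing.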

Solution 6 - C#

Yield would be limiting/unnecessary when you need random access. If you need to access element 0 then element 99, you've pretty much eliminated the usefulness of lazy evaluation.

Solution 7 - C#

I would avoid using yield return if the method has a side effect that you expect on calling the method. This is due to the deferred execution that Pop Catalin mentions.

One side effect could be modifying the system, which could happen in a method like IEnumerable<Foo> SetAllFoosToCompleteAndGetAllFoos(), which breaks the single responsibility principle. That's pretty obvious (now...), but a not so obvious side effect could be setting a cached result or similar as an optimisation.

My rules of thumb (again, now...) are:

  • Only use yield if the object being returned requires a bit of processing
  • No side effects in the method if I need to use yield
  • If I have to have side effects (and I limit those to caching etc.), don't use yield, and make sure the benefits of expanding the iteration outweigh the costs
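
To illustrate why side effects and yield mix badly, this sketch records when the side effect actually happens. `MarkCompleteAndGetIds` and `log` are hypothetical, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Hypothetical iterator method with a side effect (writing to `log`).
IEnumerable<int> MarkCompleteAndGetIds()
{
    log.Add("marked complete"); // the side effect
    yield return 1;
    yield return 2;
}

IEnumerable<int> ids = MarkCompleteAndGetIds();
int afterCall = log.Count;       // 0: calling the method ran none of its body

int total = ids.Sum();
int afterFirstPass = log.Count;  // 1: the side effect ran on enumeration

total += ids.Sum();
int afterSecondPass = log.Count; // 2: and it runs again on every enumeration

Console.WriteLine($"{afterCall}, {afterFirstPass}, {afterSecondPass}"); // prints "0, 1, 2"
```

So the side effect is both late (not at the call) and repeated (once per enumeration), which is rarely what the method name promises.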

Solution 8 - C#

One that might catch you out is if you are serialising the results of an enumeration and sending them over the wire. Because the execution is deferred until the results are needed, you will serialise an empty enumeration and send that back instead of the results you want.

Solution 9 - C#

I have to maintain a pile of code from a guy who was absolutely obsessed with yield return and IEnumerable. The problem is that a lot of third party APIs we use, as well as a lot of our own code, depend on Lists or Arrays. So I end up having to do:

IEnumerable<foo> myFoos = getSomeFoos();
List<foo> fooList = new List<foo>(myFoos);
thirdPartyApi.DoStuffWithArray(fooList.ToArray());

Not necessarily bad, but kind of annoying to deal with, and on a few occasions it's led to creating duplicate Lists in memory to avoid refactoring everything.

Solution 10 - C#

When you don't want a code block to return an iterator for sequential access to an underlying collection, you don't need yield return. You simply return the collection instead.

Solution 11 - C#

If you're defining a LINQ-style extension method where you're wrapping actual LINQ members, those members will more often than not return an iterator. Yielding through that iterator yourself is unnecessary.

Beyond that, you can't really get into much trouble using yield to define a "streaming" enumerable that is evaluated on a just-in-time basis.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stack Overflow |
| --- | --- | --- |
| Question | Lawrence P. Kelley | View Question on Stack Overflow |
| Solution 1 - C# | Eric Lippert | View Answer on Stack Overflow |
| Solution 2 - C# | Pop Catalin | View Answer on Stack Overflow |
| Solution 3 - C# | Ahmad Mageed | View Answer on Stack Overflow |
| Solution 4 - C# | StriplingWarrior | View Answer on Stack Overflow |
| Solution 5 - C# | Qwertie | View Answer on Stack Overflow |
| Solution 6 - C# | Robert Gowland | View Answer on Stack Overflow |
| Solution 7 - C# | Rebecca Scott | View Answer on Stack Overflow |
| Solution 8 - C# | Aidan | View Answer on Stack Overflow |
| Solution 9 - C# | Mike Ruhlin | View Answer on Stack Overflow |
| Solution 10 - C# | explorer | View Answer on Stack Overflow |
| Solution 11 - C# | KeithS | View Answer on Stack Overflow |