C# 5 async CTP: why is internal "state" set to 0 in generated code before EndAwait call?

C#AsynchronousC# 5.0

C# Problem Overview


Yesterday I was giving a talk about the new C# "async" feature, in particular delving into what the generated code looked like, and the GetAwaiter() / BeginAwait() / EndAwait() calls.

We looked in some detail at the state machine generated by the C# compiler, and there were two aspects we couldn't understand:

  • Why the generated class contains a Dispose() method and a $__disposing variable, which never appear to be used (and the class doesn't implement IDisposable).
  • Why the internal state variable is set to 0 before any call to EndAwait(), when 0 normally appears to mean "this is the initial entry point".

I suspect the first point could be answered by doing something more interesting within the async method, although if anyone has any further information I'd be glad to hear it. This question is more about the second point, however.

Here's a very simple piece of sample code:

using System.Threading.Tasks;

class Test
{
    static async Task<int> Sum(Task<int> t1, Task<int> t2)
    {
        return await t1 + await t2;
    }
}

... and here's the code which gets generated for the MoveNext() method which implements the state machine. This is copied directly from Reflector - I haven't fixed up the unspeakable variable names:

public void MoveNext()
{
    try
    {
        this.$__doFinallyBodies = true;
        switch (this.<>1__state)
        {
            case 1:
                break;

            case 2:
                goto Label_00DA;

            case -1:
                return;

            default:
                this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
                this.<>1__state = 1;
                this.$__doFinallyBodies = false;
                if (this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate))
                {
                    return;
                }
                this.$__doFinallyBodies = true;
                break;
        }
        this.<>1__state = 0;
        this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
        this.<a2>t__$await4 = this.t2.GetAwaiter<int>();
        this.<>1__state = 2;
        this.$__doFinallyBodies = false;
        if (this.<a2>t__$await4.BeginAwait(this.MoveNextDelegate))
        {
            return;
        }
        this.$__doFinallyBodies = true;
    Label_00DA:
        this.<>1__state = 0;
        this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();
        this.<>1__state = -1;
        this.$builder.SetResult(this.<1>t__$await1 + this.<2>t__$await3);
    }
    catch (Exception exception)
    {
        this.<>1__state = -1;
        this.$builder.SetException(exception);
    }
}

It's long, but the important lines for this question are these:

// End of awaiting t1
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

// End of awaiting t2
this.<>1__state = 0;
this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();

In both cases the state is changed again afterwards before it's next obviously observed... so why set it to 0 at all? If MoveNext() were called again at this point (either directly or via Dispose) it would effectively start the async method again, which would be wholly inappropriate as far as I can tell... if and MoveNext() isn't called, the change in state is irrelevant.

Is this simply a side-effect of the compiler reusing iterator block generation code for async, where it may have a more obvious explanation?

Important disclaimer

Obviously this is just a CTP compiler. I fully expect things to change before the final release - and possibly even before the next CTP release. This question is in no way trying to claim this is a flaw in the C# compiler or anything like that. I'm just trying to work out whether there's a subtle reason for this that I've missed :)

C# Solutions


Solution 1 - C#

Okay, I finally have a real answer. I sort of worked it out on my own, but only after Lucian Wischik from the VB part of the team confirmed that there really is a good reason for it. Many thanks to him - and please visit his blog (on archive.org), which rocks.

The value 0 here is only special because it's not a valid state which you might be in just before the await in a normal case. In particular, it's not a state which the state machine may end up testing for elsewhere. I believe that using any non-positive value would work just as well: -1 isn't used for this as it's logically incorrect, as -1 normally means "finished". I could argue that we're giving an extra meaning to state 0 at the moment, but ultimately it doesn't really matter. The point of this question was finding out why the state is being set at all.

The value is relevant if the await ends in an exception which is caught. We can end up coming back to the same await statement again, but we mustn't be in the state meaning "I'm just about to come back from that await" as otherwise all kinds of code would be skipped. It's simplest to show this with an example. Note that I'm now using the second CTP, so the generated code is slightly different to that in the question.

Here's the async method:

static async Task<int> FooAsync()
{
    var t = new SimpleAwaitable();
    
    for (int i = 0; i < 3; i++)
    {
        try
        {
            Console.WriteLine("In Try");
            return await t;
        }                
        catch (Exception)
        {
            Console.WriteLine("Trying again...");
        }
    }
    return 0;
}

Conceptually, the SimpleAwaitable can be any awaitable - maybe a task, maybe something else. For the purposes of my tests, it always returns false for IsCompleted, and throws an exception in GetResult.

Here's the generated code for MoveNext:

public void MoveNext()
{
    int returnValue;
    try
    {
        int num3 = state;
        if (num3 == 1)
        {
            goto Label_ContinuationPoint;
        }
        if (state == -1)
        {
            return;
        }
        t = new SimpleAwaitable();
        i = 0;
      Label_ContinuationPoint:
        while (i < 3)
        {
            // Label_ContinuationPoint: should be here
            try
            {
                num3 = state;
                if (num3 != 1)
                {
                    Console.WriteLine("In Try");
                    awaiter = t.GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        state = 1;
                        awaiter.OnCompleted(MoveNextDelegate);
                        return;
                    }
                }
                else
                {
                    state = 0;
                }
                int result = awaiter.GetResult();
                awaiter = null;
                returnValue = result;
                goto Label_ReturnStatement;
            }
            catch (Exception)
            {
                Console.WriteLine("Trying again...");
            }
            i++;
        }
        returnValue = 0;
    }
    catch (Exception exception)
    {
        state = -1;
        Builder.SetException(exception);
        return;
    }
  Label_ReturnStatement:
    state = -1;
    Builder.SetResult(returnValue);
}

I had to move Label_ContinuationPoint to make it valid code - otherwise it's not in the scope of the goto statement - but that doesn't affect the answer.

Think about what happens when GetResult throws its exception. We'll go through the catch block, increment i, and then loop round again (assuming i is still less than 3). We're still in whatever state we were before the GetResult call... but when we get inside the try block we must print "In Try" and call GetAwaiter again... and we'll only do that if state isn't 1. Without the state = 0 assignment, it will use the existing awaiter and skip the Console.WriteLine call.

It's a fairly tortuous bit of code to work through, but that just goes to show the kinds of thing that the team has to think about. I'm glad I'm not responsible for implementing this :)

Solution 2 - C#

if it was kept at 1 (first case) you would get a call to EndAwait without a call to BeginAwait. If it's kept at 2 (second case) you'd get the same result just on the other awaiter.

I'm guessing that calling the BeginAwait returns false if it has be started already (a guess from my side) and keeps the original value to return at the EndAwait. If that's the case it would work correctly whereas if you set it to -1 you might have an uninitialized this.<1>t__$await1 for the first case.

This however assumes that BeginAwaiter won't actually start the action on any calls after the first and that it will return false in those cases. Starting would of course be unacceptable since it could have side effect or simply give a different result. It also assumpes that the EndAwaiter will always return the same value no matter how many times it's called and that is can be called when BeginAwait returns false (as per the above assumption)

It would seem to be a guard against race conditions If we inline the statements where movenext is called by a different thread after the state = 0 in questions it woule look something like the below

this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.<>1__state = 0;

//second thread
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.$__doFinallyBodies = true;
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

//other thread
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

If the assumptions above are correct the there's some unneeded work done such as get sawiater and reassigning the same value to <1>t__$await1. If the state was kept at 1 then the last part would in stead be:

//second thread
//I suppose this un matched call to EndAwait will fail
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

further if it was set to 2 the state machine would assume it already had gotten the value of the first action which would be untrue and a (potentially) unassigned variable would be used to calculate the result

Solution 3 - C#

Could it be something to do with stacked/nested async calls ?..

i.e:

async Task m1()
{
    await m2;
}

async Task m2()
{
    await m3();
}

async Task m3()
{
Thread.Sleep(10000);
}

Does the movenext delegate get called multiple times in this situation ?

Just a punt really?

Solution 4 - C#

Explanation of actual states:

possible states:

  • 0 Initialized (i think so) or waiting for end of operation
  • >0 just called MoveNext, chosing next state
  • -1 ended

Is it possible that this implementation just wants to assure that if another Call to MoveNext from whereever happens (while waiting) it will reevaluate the whole state-chain again from the beginning, to reevaluate results which could be in the mean time already outdated?

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJon SkeetView Question on Stackoverflow
Solution 1 - C#Jon SkeetView Answer on Stackoverflow
Solution 2 - C#Rune FSView Answer on Stackoverflow
Solution 3 - C#GaryMcAllisterView Answer on Stackoverflow
Solution 4 - C#fixagonView Answer on Stackoverflow