How to write Asynchronous LINQ query?

C#LinqLinq to-SqlAsynchronous

C# Problem Overview


After I read a bunch of LINQ related stuff, I suddenly realized that no articles introduce how to write asynchronous LINQ query.

Suppose we use LINQ to SQL, below statement is clear. However, if the SQL database responds slowly, then the thread using this block of code would be hindered.

var result = from item in Products where item.Price > 3 select item.Name;
foreach (var name in result)
{
    Console.WriteLine(name);
}

Seems that current LINQ query spec doesn't provide support to this.

Is there any way to do asynchronous programming LINQ? It works like there is a callback notification when results are ready to use without any blocking delay on I/O.

C# Solutions


Solution 1 - C#

While LINQ doesn't really have this per se, the framework itself does... You can easily roll your own asynchronous query executor in 30 lines or so... In fact, I just threw this together for you :)

EDIT: Through writing this, I've discovered why they didn't implement it. It cannot handle anonymous types since they are scoped local. Thus, you have no way of defining your callback function. This is a pretty major thing since a lot of linq to sql stuff creates them in the select clause. Any of the below suggestions suffer the same fate, so I still think this one is the easiest to use!

EDIT: The only solution is to not use anonymous types. You can declare the callback as just taking IEnumerable (no type args), and use reflection to access the fields (ICK!!). Another way would be to declare the callback as "dynamic"... oh... wait... That's not out yet. :) This is another decent example of how dynamic could be used. Some may call it abuse.

Throw this in your utilities library:

public static class AsynchronousQueryExecutor
{
    public static void Call<T>(IEnumerable<T> query, Action<IEnumerable<T>> callback, Action<Exception> errorCallback)
    {
        Func<IEnumerable<T>, IEnumerable<T>> func =
            new Func<IEnumerable<T>, IEnumerable<T>>(InnerEnumerate<T>);
        IEnumerable<T> result = null;
        IAsyncResult ar = func.BeginInvoke(
                            query,
                            new AsyncCallback(delegate(IAsyncResult arr)
                            {
                                try
                                {
                                    result = ((Func<IEnumerable<T>, IEnumerable<T>>)((AsyncResult)arr).AsyncDelegate).EndInvoke(arr);
                                }
                                catch (Exception ex)
                                {
                                    if (errorCallback != null)
                                    {
                                        errorCallback(ex);
                                    }
                                    return;
                                }
                                //errors from inside here are the callbacks problem
                                //I think it would be confusing to report them
                                callback(result);
                            }),
                            null);
    }
    private static IEnumerable<T> InnerEnumerate<T>(IEnumerable<T> query)
    {
        foreach (var item in query) //the method hangs here while the query executes
        {
            yield return item;
        }
    }
}

And you could use it like this:

class Program
{

    public static void Main(string[] args)
    {
        //this could be your linq query
        var qry = TestSlowLoadingEnumerable();

        //We begin the call and give it our callback delegate
        //and a delegate to an error handler
        AsynchronousQueryExecutor.Call(qry, HandleResults, HandleError);

        Console.WriteLine("Call began on seperate thread, execution continued");
        Console.ReadLine();
    }

    public static void HandleResults(IEnumerable<int> results)
    {
        //the results are available in here
        foreach (var item in results)
        {
            Console.WriteLine(item);
        }
    }

    public static void HandleError(Exception ex)
    {
        Console.WriteLine("error");
    }

    //just a sample lazy loading enumerable
    public static IEnumerable<int> TestSlowLoadingEnumerable()
    {
        Thread.Sleep(5000);
        foreach (var i in new int[] { 1, 2, 3, 4, 5, 6 })
        {
            yield return i;
        }
    }

}

Going to go put this up on my blog now, pretty handy.

Solution 2 - C#

TheSoftwareJedi's and ulrikb's(aka user316318) solutions are good for any LINQ type, but (as pointed by Chris Moschini) do NOT delegating to underlying asynchronous calls that leverage Windows I/O Completion Ports.

Wesley Bakker's Asynchronous DataContext post (triggered by a blog post of Scott Hanselman ) describe class for LINQ to SQL that uses sqlCommand.BeginExecuteReader/sqlCommand.EndExecuteReader, which leverage Windows I/O Completion Ports.

I/O completion ports provide an efficient threading model for processing multiple asynchronous I/O requests on a multiprocessor system.

Solution 3 - C#

Based on Michael Freidgeim's answer and mentioned blog post from Scott Hansellman and fact that you can use async/await, you can implement reusable ExecuteAsync<T>(...) method, which executes underlying SqlCommand asynchronously:

protected static async Task<IEnumerable<T>> ExecuteAsync<T>(IQueryable<T> query,
    DataContext ctx,
    CancellationToken token = default(CancellationToken))
{
    var cmd = (SqlCommand)ctx.GetCommand(query);

    if (cmd.Connection.State == ConnectionState.Closed)
        await cmd.Connection.OpenAsync(token);
    var reader = await cmd.ExecuteReaderAsync(token);

    return ctx.Translate<T>(reader);
}

And then you can (re)use it like this:

public async Task WriteNamesToConsoleAsync(string connectionString, CancellationToken token = default(CancellationToken))
{
    using (var ctx = new DataContext(connectionString))
    {
        var query = from item in Products where item.Price > 3 select item.Name;
        var result = await ExecuteAsync(query, ctx, token);
        foreach (var name in result)
        {
            Console.WriteLine(name);
        }
    }
}

Solution 4 - C#

I started a simple github project named Asynq to do asynchronous LINQ-to-SQL query execution. The idea is quite simple albeit "brittle" at this stage (as of 8/16/2011):

  1. Let LINQ-to-SQL do the "heavy" work of translating your IQueryable into a DbCommand via the DataContext.GetCommand().
  2. For SQL 200[058], cast up from the abstract DbCommand instance you got from GetCommand() to get a SqlCommand. If you're using SQL CE you're out of luck since SqlCeCommand does not expose the async pattern for BeginExecuteReader and EndExecuteReader.
  3. Use BeginExecuteReader and EndExecuteReader off the SqlCommand using the standard .NET framework asynchronous I/O pattern to get yourself a DbDataReader in the completion callback delegate that you pass to the BeginExecuteReader method.
  4. Now we have a DbDataReader which we have no idea what columns it contains nor how to map those values back up to the IQueryable's ElementType (most likely to be an anonymous type in the case of joins). Sure, at this point you could hand-write your own column mapper that materializes its results back into your anonymous type or whatever. You'd have to write a new one per each query result type, depending on how LINQ-to-SQL treats your IQueryable and what SQL code it generates. This is a pretty nasty option and I don't recommend it since it's not maintainable nor would it be always correct. LINQ-to-SQL can change your query form depending on the parameter values you pass in, for example query.Take(10).Skip(0) produces different SQL than query.Take(10).Skip(10), and perhaps a different resultset schema. Your best bet is to handle this materialization problem programmatically:
  5. "Re-implement" a simplistic runtime object materializer that pulls columns off the DbDataReader in a defined order according to the LINQ-to-SQL mapping attributes of the ElementType Type for the IQueryable. Implementing this correctly is probably the most challenging part of this solution.

As others have discovered, the DataContext.Translate() method does not handle anonymous types and can only map a DbDataReader directly to a properly attributed LINQ-to-SQL proxy object. Since most queries worth writing in LINQ are going to involve complex joins which inevitably end up requiring anonymous types for the final select clause, it's pretty pointless to use this provided watered-down DataContext.Translate() method anyway.

There are a few minor drawbacks to this solution when leveraging the existing mature LINQ-to-SQL IQueryable provider:

  1. You cannot map a single object instance to multiple anonymous type properties in the final select clause of your IQueryable, e.g. from x in db.Table1 select new { a = x, b = x }. LINQ-to-SQL internally keeps track of which column ordinals map to which properties; it does not expose this information to the end user so you have no idea which columns in the DbDataReader are reused and which are "distinct".
  2. You cannot include constant values in your final select clause - these do not get translated into SQL and will be absent from the DbDataReader so you'd have to build custom logic to pull these constant values up from the IQueryable's Expression tree, which would be quite a hassle and is simply not justifiable.

I'm sure there are other query patterns that might break but these are the two biggest I could think of that could cause problems in an existing LINQ-to-SQL data access layer.

These problems are easy to defeat - simply don't do them in your queries since neither pattern provides any benefit to the end result of the query. Hopefully this advice applies to all query patterns that would potentially cause object materialization problems :-P. It's a hard problem to solve not having access to LINQ-to-SQL's column mapping information.

A more "complete" approach to solving the problem would be to effectively re-implement nearly all of LINQ-to-SQL, which is a bit more time-consuming :-P. Starting from a quality, open-source LINQ-to-SQL provider implementation would be a good way to go here. The reason you'd need to reimplement it is so that you'd have access to all of the column mapping information used to materialize the DbDataReader results back up to an object instance without any loss of information.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMorgan ChengView Question on Stackoverflow
Solution 1 - C#TheSoftwareJediView Answer on Stackoverflow
Solution 2 - C#Michael FreidgeimView Answer on Stackoverflow
Solution 3 - C#NenadView Answer on Stackoverflow
Solution 4 - C#James DunneView Answer on Stackoverflow