Parallel.ForEach and async-await
C#Async AwaitTask Parallel-Libraryparallel.foreachC# Problem Overview
I had such method:
public async Task<MyResult> GetResult()
{
MyResult result = new MyResult();
foreach(var method in Methods)
{
string json = await Process(method);
result.Prop1 = PopulateProp1(json);
result.Prop2 = PopulateProp2(json);
}
return result;
}
Then I decided to use Parallel.ForEach
:
public async Task<MyResult> GetResult()
{
MyResult result = new MyResult();
Parallel.ForEach(Methods, async method =>
{
string json = await Process(method);
result.Prop1 = PopulateProp1(json);
result.Prop2 = PopulateProp2(json);
});
return result;
}
But now I've got an error:
>An asynchronous module or handler completed while an asynchronous operation was still pending.
C# Solutions
Solution 1 - C#
async
doesn't work well with ForEach
. In particular, your async
lambda is being converted to an async void
method. There are a number of reasons to avoid async void
(as I describe in an MSDN article); one of them is that you can't easily detect when the async
lambda has completed. ASP.NET will see your code return without completing the async void
method and (appropriately) throw an exception.
What you probably want to do is process the data concurrently, just not in parallel. Parallel code should almost never be used on ASP.NET. Here's what the code would look like with asynchronous concurrent processing:
public async Task<MyResult> GetResult()
{
MyResult result = new MyResult();
var tasks = Methods.Select(method => ProcessAsync(method)).ToArray();
string[] json = await Task.WhenAll(tasks);
result.Prop1 = PopulateProp1(json[0]);
...
return result;
}
Solution 2 - C#
Alternatively, with the AsyncEnumerator NuGet Package you can do this:
using System.Collections.Async;
public async Task<MyResult> GetResult()
{
MyResult result = new MyResult();
await Methods.ParallelForEachAsync(async method =>
{
string json = await Process(method);
result.Prop1 = PopulateProp1(json);
result.Prop2 = PopulateProp2(json);
}, maxDegreeOfParallelism: 10);
return result;
}
where ParallelForEachAsync
is an extension method.
Solution 3 - C#
.NET 6 finally added Parallel.ForEachAsync, a way to schedule asynchronous work that allows you to control the degree of parallelism:
var urlsToDownload = new []
{
"https://dotnet.microsoft.com",
"https://www.microsoft.com",
"https://twitter.com/shahabfar"
};
var client = new HttpClient();
var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
await Parallel.ForEachAsync(urlsToDownload, options, async (url, token) =>
{
var targetPath = Path.Combine(Path.GetTempPath(), "http_cache", url);
var response = await client.GetAsync(url, token);
// The request will be canceled in case of an error in another URL.
if (response.IsSuccessStatusCode)
{
using var target = File.OpenWrite(targetPath);
await response.Content.CopyToAsync(target);
}
});
Solution 4 - C#
Ahh, okay. I think I know what's going on now. async method =>
an "async void" which is "fire and forget" (not recommended for anything other than event handlers). This means the caller cannot know when it is completed... So, GetResult
returns while the operation is still running. Although the technical details of my first answer are incorrect, the result is the same here: that GetResult is returning while the operations started by ForEach
are still running. The only thing you could really do is not await
on Process
(so that the lambda is no longer async
) and wait for Process
to complete each iteration. But, that will use at least one thread pool thread to do that and thus stress the pool slightly--likely making use of ForEach
pointless. I would simply not use Parallel.ForEach...