Starting Tasks In foreach Loop Uses Value of Last Item
C#MultithreadingTask Parallel-LibraryC# Problem Overview
I am making a first attempt at playing with the new Tasks, but something is happening that I don't understand.
First, the code, which is pretty straight-forward. I pass in a list of paths to some image files, and attempt to add a task to process each of them:
public Boolean AddPictures(IList<string> paths)
{
Boolean result = (paths.Count > 0);
List<Task> tasks = new List<Task>(paths.Count);
foreach (string path in paths)
{
var task = Task.Factory.StartNew(() =>
{
Boolean taskResult = ProcessPicture(path);
return taskResult;
});
task.ContinueWith(t => result &= t.Result);
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
return result;
}
I've found that if I just let this run with, say, a list of 3 paths in a unit test, all three tasks use the last path in the provided list. If I step through (and slow down the processing of the loop), each path from the loop is used.
Can somebody please explain what is happening, and why? Possible workarounds?
C# Solutions
Solution 1 - C#
You're closing over the loop variable. Don't do that. Take a copy instead:
foreach (string path in paths)
{
string pathCopy = path;
var task = Task.Factory.StartNew(() =>
{
Boolean taskResult = ProcessPicture(pathCopy);
return taskResult;
});
// See note at end of post
task.ContinueWith(t => result &= t.Result);
tasks.Add(task);
}
Your current code is capturing path
- not the value of it when you create the task, but the variable itself. That variable changes value each time you go through the loop - so it can easily change by the time your delegate is called.
By taking a copy of the variable, you're introducing a new variable each time you go through the loop - when you capture that variable, it won't be changed in the next iteration of the loop.
Eric Lippert has a pair of blog posts which go into this in a lot more detail: part 1; part 2.
Don't feel bad - this catches almost everyone out :(
Note about this line:
task.ContinueWith(t => result &= t.Result);
As pointed out in comments, this isn't thread-safe. Multiple threads could execute it at the same time, potentially stamping on each other's results. I haven't added locking or anything similar as it would distract from the main issue that the question is interested, namely variable capture. However, it's worth being aware of.
Solution 2 - C#
The lambda that you're passing to StartNew
is referencing the path
variable, which changes on each iteration (i.e. your lambda is making use of the reference of path
, rather than just its value). You can create a local copy of it so that you aren't pointing to a version that will change:
foreach (string path in paths)
{
var lambdaPath = path;
var task = Task.Factory.StartNew(() =>
{
Boolean taskResult = ProcessPicture(lambdaPath);
return taskResult;
});
task.ContinueWith(t => result &= t.Result);
tasks.Add(task);
}