When to use BlockingCollection and when ConcurrentBag instead of List<T>?

C#, WPF, Multithreading, LINQ, Task Parallel Library

C# Problem Overview


The accepted answer to the question "Why does this Parallel.ForEach code freeze the program up?" advises replacing the List with a ConcurrentBag in a WPF application.

Could a BlockingCollection be used in this case instead?

C# Solutions


Solution 1 - C#

You can indeed use a BlockingCollection, but there is absolutely no point in doing so.

First off, note that BlockingCollection is a wrapper around a collection that implements IProducerConsumerCollection<T>. Any type that implements that interface can be used as the underlying storage:

> When you create a BlockingCollection<T> object, you can specify not only the bounded capacity but also the type of collection to use. For example, you could specify a ConcurrentQueue<T> object for first in, first out (FIFO) behavior, or a ConcurrentStack<T> object for last in, first out (LIFO) behavior. You can use any collection class that implements the IProducerConsumerCollection<T> interface. The default collection type for BlockingCollection<T> is ConcurrentQueue<T>.
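As a minimal sketch of the quoted passage, the following shows the same BlockingCollection<int> code behaving differently depending on the underlying storage. The class name UnderlyingStorageDemo and the helper FirstTaken are made up for this illustration.

```csharp
using System;
using System.Collections.Concurrent;

public static class UnderlyingStorageDemo
{
    // Hypothetical helper: adds 1, 2, 3 and returns the first item taken out.
    public static int FirstTaken(BlockingCollection<int> collection)
    {
        collection.Add(1);
        collection.Add(2);
        collection.Add(3);
        return collection.Take();
    }

    public static void Main()
    {
        // Default storage is ConcurrentQueue<T>: first in, first out.
        using var fifo = new BlockingCollection<int>();
        Console.WriteLine(FirstTaken(fifo)); // prints 1

        // ConcurrentStack<T> storage: last in, first out.
        using var lifo = new BlockingCollection<int>(new ConcurrentStack<int>());
        Console.WriteLine(FirstTaken(lifo)); // prints 3

        // ConcurrentBag<T> storage: no ordering guarantee, so only the count is shown.
        using var bag = new BlockingCollection<int>(new ConcurrentBag<int>());
        FirstTaken(bag);
        Console.WriteLine(bag.Count); // prints 2
    }
}
```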

This includes ConcurrentBag<T>, which means you can have a blocking concurrent bag. So what's the difference between a plain IProducerConsumerCollection<T> and a blocking collection? The documentation of BlockingCollection says (emphasis mine):

> BlockingCollection<T> is used as a wrapper for an IProducerConsumerCollection<T> instance, allowing removal attempts from the collection to block until data is available to be removed. Similarly, a BlockingCollection<T> can be created to enforce an upper-bound on the number of data elements allowed in the IProducerConsumerCollection<T> [...]

Since in the linked question there is no need to do either of these things, using BlockingCollection simply adds a layer of functionality that goes unused.
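For contrast, here is a hedged sketch of a scenario where those two features do matter: one producer and one consumer sharing a bounded buffer. The class BlockingDemo, the method ProduceAndConsume, and the capacity of 2 are assumptions chosen for the example, not part of the linked question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BlockingDemo
{
    public static List<int> ProduceAndConsume(int itemCount)
    {
        // At most 2 items may sit in the buffer; Add blocks while it is full.
        using var buffer = new BlockingCollection<int>(boundedCapacity: 2);
        var consumed = new List<int>(); // touched only by the consumer task

        // The consumer blocks until an item is available. The loop ends once
        // CompleteAdding has been called and the buffer has been drained.
        Task consumer = Task.Run(() =>
        {
            foreach (int item in buffer.GetConsumingEnumerable())
                consumed.Add(item);
        });

        for (int i = 0; i < itemCount; i++)
            buffer.Add(i);
        buffer.CompleteAdding();

        consumer.Wait();
        return consumed;
    }

    public static void Main()
    {
        // Single producer, single consumer, queue storage: items arrive in order.
        Console.WriteLine(string.Join(", ", ProduceAndConsume(5))); // prints 0, 1, 2, 3, 4
    }
}
```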

Solution 2 - C#

  • List<T> is a collection designed for use in single-threaded applications.

  • ConcurrentBag<T> is a class in the System.Collections.Concurrent namespace, designed to simplify using collections in multithreaded environments. If you use a concurrent collection, you do not have to lock it to prevent corruption by other threads. You can insert or take data without writing any special locking code.

  • BlockingCollection<T> is designed to remove the need to check whether new data is available in a collection shared between threads. When new data is inserted into the shared collection, the waiting consumer thread wakes up immediately. So you do not have to poll for new data at fixed intervals, typically in a while loop.
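The first two points above can be sketched as follows. The class ThreadSafetyDemo and its method names are illustrative only; the sketch shows that a List<T> needs an external lock when written to from Parallel.For, while a ConcurrentBag<T> does not.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThreadSafetyDemo
{
    public static int AddWithBag(int count)
    {
        var bag = new ConcurrentBag<int>();
        Parallel.For(0, count, i => bag.Add(i)); // no lock needed
        return bag.Count;
    }

    public static int AddWithLockedList(int count)
    {
        var list = new List<int>();
        var gate = new object();
        Parallel.For(0, count, i =>
        {
            // List<T> is not thread-safe, so every write is serialized here.
            lock (gate) { list.Add(i); }
        });
        return list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(AddWithBag(10000));        // prints 10000
        Console.WriteLine(AddWithLockedList(10000)); // prints 10000
    }
}
```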

Solution 3 - C#

Whenever you find the need for a thread-safe List<T>, in most cases neither the ConcurrentBag<T> nor the BlockingCollection<T> is going to be your best option. Both collections are specialized for facilitating producer-consumer scenarios, so unless you have more than one thread concurrently adding and removing items from the collection, you should look for other options (with the best candidate being the ConcurrentQueue<T> in most cases).

As for the ConcurrentBag<T> in particular, it's an extremely specialized class targeting mixed producer-consumer scenarios. This means that each worker thread is expected to be both a producer and a consumer (one that adds and removes items from the same collection). It could be a good candidate for the internal storage of an [ObjectPool][3] class, but beyond that it is hard to imagine any advantageous usage scenario for this class.

People usually think that the ConcurrentBag<T> is the thread-safe equivalent of a List<T>, but it's not. The similarity of the two APIs is misleading. Calling Add on a List<T> results in the item being added at the end of the list. Calling [Add][4] on a ConcurrentBag<T> results instead in the item being added at a random slot inside the bag. The ConcurrentBag<T> is essentially unordered. It is [not optimized][5] for being enumerated, and does a lousy job when it is asked to do so. Internally it maintains a bunch of thread-local queues, so the order of its contents is dominated by which thread did what, not by when something happened.

These characteristics make the ConcurrentBag<T> a less than ideal choice for storing the results of a Parallel.For/Parallel.ForEach loop.

A better thread-safe substitute for List<T>.Add is the [ConcurrentQueue<T>.Enqueue][6] method. "Enqueue" is a less familiar word than "Add", but it actually does what you expect it to do.

There is nothing that a ConcurrentBag<T> can do that a ConcurrentQueue<T> can't. For example, neither collection offers a way to [remove a specific item][7] from the collection. If you want a concurrent collection with a TryRemove method that has a key parameter, you could look at the [ConcurrentDictionary<K,V>][8] class.
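A minimal sketch of collecting Parallel.ForEach results with Enqueue might look like this. The class QueueResultsDemo and the squaring workload are placeholders invented for the example. Note that with several threads the enqueue order depends on scheduling, so the sketch checks only the count and the sum.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class QueueResultsDemo
{
    public static List<int> SquareAll(IEnumerable<int> inputs)
    {
        var results = new ConcurrentQueue<int>();

        // Each worker thread appends its result to the tail of the queue.
        Parallel.ForEach(inputs, x => results.Enqueue(x * x));

        // Snapshot of the queue contents, in the order the items were enqueued.
        return results.ToList();
    }

    public static void Main()
    {
        List<int> squares = SquareAll(Enumerable.Range(1, 100));
        Console.WriteLine(squares.Count); // prints 100
        Console.WriteLine(squares.Sum()); // prints 338350
    }
}
```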
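Removal by key with ConcurrentDictionary<K,V> could be sketched like this. The class RemoveByKeyDemo, the integer keys, and the sample values are assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;

public static class RemoveByKeyDemo
{
    // Hypothetical sample data keyed by an integer id.
    public static ConcurrentDictionary<int, string> BuildSample()
    {
        var items = new ConcurrentDictionary<int, string>();
        items.TryAdd(1, "alpha");
        items.TryAdd(2, "beta");
        items.TryAdd(3, "gamma");
        return items;
    }

    public static void Main()
    {
        var items = BuildSample();

        // TryRemove takes the key and hands back the removed value.
        bool removed = items.TryRemove(2, out var removedValue);
        Console.WriteLine($"{removed}: {removedValue}"); // prints True: beta

        // A second attempt fails because the key is already gone.
        Console.WriteLine(items.TryRemove(2, out _)); // prints False
        Console.WriteLine(items.Count);               // prints 2
    }
}
```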

[3]: https://docs.microsoft.com/en-us/dotnet/standard/collections/thread-safe/how-to-create-an-object-pool "Create an object pool by using a ConcurrentBag"
[4]: https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentbag-1.add
[5]: https://github.com/dotnet/runtime/issues/14835 "Improve ConcurrentBag GetEnumerator performance"
[6]: https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentqueue-1.enqueue
[7]: https://stackoverflow.com/questions/3029818/how-to-remove-a-single-specific-object-from-a-concurrentbag
[8]: https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.tryremove

Solution 4 - C#

Yes, you could use BlockingCollection for that. finishedProxies would be defined as:

BlockingCollection<string> finishedProxies = new BlockingCollection<string>();

and to add an item, you would write:

finishedProxies.Add(checkResult);

And when it's done, you could create a list from the contents.
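Put together, the steps above might look like the following sketch. The names finishedProxies and checkResult come from the answer; the class FinishedProxiesDemo, the method CheckAll, and the placeholder string that stands in for the real proxy check are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FinishedProxiesDemo
{
    public static List<string> CheckAll(IEnumerable<string> proxies)
    {
        using var finishedProxies = new BlockingCollection<string>();

        Parallel.ForEach(proxies, proxy =>
        {
            // Placeholder for the real check performed in the linked question.
            string checkResult = proxy + ": checked";
            finishedProxies.Add(checkResult);
        });

        // Once the loop is done, copy the contents into an ordinary list.
        return finishedProxies.ToList();
    }

    public static void Main()
    {
        List<string> results = CheckAll(new[] { "10.0.0.1:8080", "10.0.0.2:8080" });
        Console.WriteLine(results.Count); // prints 2
    }
}
```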

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Fulproof | View Question on Stackoverflow |
| Solution 1 - C# | Jon | View Answer on Stackoverflow |
| Solution 2 - C# | Ahmet Arslan | View Answer on Stackoverflow |
| Solution 3 - C# | Theodor Zoulias | View Answer on Stackoverflow |
| Solution 4 - C# | Jim Mischel | View Answer on Stackoverflow |