Why can't I preallocate a hashset<T>

C#Hashset

C# Problem Overview


Why can't I preallocate a hashset<T>?

There are times when i might be adding a lot of elements to it and i want to eliminate resizing.

C# Solutions


Solution 1 - C#

Answer below was written in 2011. It's now in .NET 4.7.2 and .NET Core 2.0; it will be in .NET Standard 2.1.


There's no technical reason why this shouldn't be possible - Microsoft just hasn't chosen to expose a constructor with an initial capacity.

If you can call a constructor which takes an IEnumerable<T> and use an implementation of ICollection<T>, I believe that will use the size of the collection as the initial minimum capacity. This is an implementation detail, mind you. The capacity only has to be large enough to store all the distinct elements...

EDIT: I believe that if the capacity turns out to be way larger than it needs to be, the constructor will trim the excess when it's finished finding out how many distinct elements there really are.

Anyway, if you have the collection you're going to add to the HashSet<T> and it implements ICollection<T>, then passing it to the constructor instead of adding the elements one by one is going to be a win, basically :)

EDIT: One workaround would be to use a Dictionary<TKey, TValue> instead of a HashSet<T>, and just not use the values. That won't work in all cases though, as it won't give you the same interface as HashSet<T>.

Solution 2 - C#

The answer by Jon Skeet is almost a complete one. To solve this problem with HashSet<int> I had to do the following:

public class ClassUsingHashSet
{
    private static readonly List<int> PreallocationList
        = Enumerable.Range(0, 10000).ToList();

    public ClassUsingHashSet()
    {
        this.hashSet = new HashSet<int>(PreallocationList);
        this.hashSet.Clear();
    }

    public void Add(int item)
    {
        this.hashSet.Add(item);
    }

    private HashSet<int> hashSet;
}

This trick works because after Clear the HashSet is not trimmed, as described in the documentation:

>The capacity remains unchanged until a call to TrimExcess is made.

Solution 3 - C#

I'm using this code to set initial capacity for HashSet. You can use it as extension or directly

public static class HashSetExtensions
{
    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.NonPublic;
    public static HashSet<T> SetCapacity<T>(this HashSet<T> hs, int capacity)
    {
        var initialize = hs.GetType().GetMethod("Initialize", Flags);
        initialize.Invoke(hs, new object[] { capacity });
        return hs;
    }

    public static HashSet<T> GetHashSet<T>(int capacity)
    {
        return new HashSet<T>().SetCapacity(capacity);
    }
}

upd. 04 jule

This code may be also enhanced by using reflection caching. Here we go:

public static class HashSetExtensions
{
    private static class HashSetDelegateHolder<T>
    {
        private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.NonPublic;
        public static MethodInfo InitializeMethod { get; } = typeof(HashSet<T>).GetMethod("Initialize", Flags);
    }

    public static void SetCapacity<T>(this HashSet<T> hs, int capacity)
    {
        HashSetDelegateHolder<T>.InitializeMethod.Invoke(hs, new object[] { capacity });
    }

    public static HashSet<T> GetHashSet<T>(int capacity)
    {
        var hashSet = new HashSet<T>();
        hashSet.SetCapacity(capacity);
        return hashSet;
    }
}

Solution 4 - C#

This capability was added in 4.7.2:

HashSet<T>(Int32)

Initializes a new instance of the HashSet<T> class that is empty, 
but has reserved space for capacity items and uses the default 
equality comparer for the set type.

Solution 5 - C#

The only way to initialize the HashSet with an initial capacity is to construct it with a instnace of a class, such as a List<T>, that implements ICollection<T>. It will call Count on the ICollection<T> allocate enough space to hold the collection and add all the elements to the HashSet without reallocation.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRyan LeeView Question on Stackoverflow
Solution 1 - C#Jon SkeetView Answer on Stackoverflow
Solution 2 - C#BartoszKPView Answer on Stackoverflow
Solution 3 - C#Alex ZhukovskiyView Answer on Stackoverflow
Solution 4 - C#David WohlferdView Answer on Stackoverflow
Solution 5 - C#chuckjView Answer on Stackoverflow