ES6 Map and Set complexity, V8 implementation

Javascript · Ecmascript 6 · Set · Complexity Theory · V8

Javascript Problem Overview


Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?

(I know that the standard doesn't guarantee that)

Javascript Solutions


Solution 1 - Javascript

> Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?

Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.

For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.

Solution 2 - Javascript

For people who don't want to go too deep down the rabbit hole:

1. We can assume that good hash table implementations have practically O(1) time complexity.
2. The V8 team published a blog post explaining how some memory optimizations were done on its hash table implementation for Map, Set, WeakSet, and WeakMap: Optimizing hash tables: hiding the hash code

Based on 1 and 2: the get, set, add, and has operations of V8's Set and Map are practically O(1).

Solution 3 - Javascript

let map = new Map();
let obj = {};

const benchMarkMapSet = size => {
  console.time("benchMarkMapSet");
  for (let i = 0; i < size; i++) {
    map.set(i, i);
  }
  console.timeEnd("benchMarkMapSet");
};

const benchMarkMapGet = size => {
  console.time("benchMarkMapGet");
  for (let i = 0; i < size; i++) {
    let x = map.get(i);
  }
  console.timeEnd("benchMarkMapGet");
};

const benchMarkObjSet = size => {
  console.time("benchMarkObjSet");
  for (let i = 0; i < size; i++) {
    obj[i] = i;
  }
  console.timeEnd("benchMarkObjSet");
};

const benchMarkObjGet = size => {
  console.time("benchMarkObjGet");
  for (let i = 0; i < size; i++) {
    let x = obj[i];
  }
  console.timeEnd("benchMarkObjGet");
};

let size = 2e6;

benchMarkMapSet(size);
benchMarkObjSet(size);
benchMarkMapGet(size);
benchMarkObjGet(size);

benchMarkMapSet: 382.935ms
benchMarkObjSet: 76.077ms
benchMarkMapGet: 125.478ms
benchMarkObjGet: 2.764ms

benchMarkMapSet: 373.172ms
benchMarkObjSet: 77.192ms
benchMarkMapGet: 123.035ms
benchMarkObjGet: 2.638ms

Solution 4 - Javascript

The original question is already answered, but O(1) doesn't tell a lot about how efficient the implementation is.

First of all, we need to understand what variation of hash table is used for Maps. "Classical" hash tables won't do as they do not provide any iteration order guarantees, while ES6 spec requires insertion to be in the iteration order. So, Maps in V8 are built on top of so-called deterministic hash tables. The idea is similar to the classical algorithm, but there is another layer of indirection for buckets and all entries are inserted and stored in a contiguous array of a fixed size. Deterministic hash tables algorithm indeed guarantees O(1) time complexity for basic operations, like set or get.
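The deterministic-hash-table idea above can be sketched in plain JavaScript. This is a simplified illustration, not V8's actual implementation: the class name, the toy hash function, and the entry layout are all made up for the example; only the structure (contiguous entry array plus an indirection layer of buckets) follows the description.

```javascript
// Simplified deterministic hash table sketch (not V8's actual code).
// Entries live in a contiguous array in insertion order; each bucket
// holds the index of the head of its collision chain, and each entry
// stores the index of the next entry in the same bucket.
class DeterministicHashTable {
  constructor(capacity = 4) {
    this.buckets = new Array(capacity).fill(-1); // bucket -> first entry index
    this.entries = [];                           // [key, value, nextIndex]
  }

  _bucket(key) {
    // Toy string hash; real implementations use a proper hash function.
    let h = 0;
    for (const ch of String(key)) h = (h * 31 + ch.charCodeAt(0)) | 0;
    return Math.abs(h) % this.buckets.length;
  }

  set(key, value) {
    const b = this._bucket(key);
    // Walk the chain; update in place if the key already exists.
    for (let i = this.buckets[b]; i !== -1; i = this.entries[i][2]) {
      if (this.entries[i][0] === key) { this.entries[i][1] = value; return; }
    }
    // New entry goes at the end of the contiguous array (insertion order)
    // and becomes the new head of its bucket's chain.
    this.entries.push([key, value, this.buckets[b]]);
    this.buckets[b] = this.entries.length - 1;
  }

  get(key) {
    const b = this._bucket(key);
    for (let i = this.buckets[b]; i !== -1; i = this.entries[i][2]) {
      if (this.entries[i][0] === key) return this.entries[i][1];
    }
    return undefined;
  }

  *[Symbol.iterator]() {
    // Iteration order equals insertion order, because entries are contiguous.
    for (const [k, v] of this.entries) yield [k, v];
  }
}
```

The extra indirection is what buys the deterministic iteration order: lookups still hash into a bucket and walk a (short, on average) chain, while iteration just scans the entry array.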

Next, we need to know the initial size of the hash table, the load factor, and how (and when) it grows or shrinks. The short answer: the initial size is 4, the load factor is 2, the table (i.e. the Map) doubles as soon as its capacity is reached, and shrinks as soon as more than 1/2 of its entries are deleted.
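Those numbers imply a simple growth schedule, which can be sketched as follows. This is a back-of-the-envelope helper based purely on the figures stated above (initial capacity 4, doubling on growth), not code read from V8's source:

```javascript
// Capacity reached after inserting n entries, per the description above:
// start at 4, double whenever the current capacity is exceeded.
function capacityAfter(n) {
  let capacity = 4;
  while (capacity < n) capacity *= 2;
  return capacity;
}

console.log(capacityAfter(5));   // 8
console.log(capacityAfter(100)); // 128
```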

Let's consider the worst case, when the table has N out of N entries (it's full), all entries belong to a single bucket, and the required entry is located at the tail. In such a scenario, a lookup requires N moves through the chain elements.

On the other hand, in the best possible scenario when the table is full, but each bucket has 2 entries, a lookup will require up to 2 moves.

It is a well-known fact that while individual operations in hash tables are “cheap”, rehashing is not. Rehashing has O(N) time complexity and requires allocation of the new hash table on the heap. Moreover, rehashing is performed as a part of insertion or deletion operations, when necessary. So, for instance, a map.set() call could be more expensive than you would expect. Luckily, rehashing is a relatively infrequent operation.
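The "rehashing is infrequent" point can be made quantitative with a quick back-of-the-envelope calculation (an illustration of the doubling schedule above, not a measurement of V8): each rehash copies every live entry, but rehashes happen only at capacities 4, 8, 16, ..., so the total copying work over n inserts is bounded by roughly 2n, which keeps the amortized cost per insert at O(1).

```javascript
// Total entries copied by rehashes while inserting n entries, assuming
// the table doubles at capacities 4, 8, 16, ... (per the schedule above).
function totalRehashCopies(n) {
  let copies = 0;
  for (let cap = 4; cap < n; cap *= 2) copies += cap;
  return copies;
}

// The geometric series keeps total copies within ~2n of the insert count:
console.log(totalRehashCopies(1024)); // 4 + 8 + ... + 512 = 1020
```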

Apart from that, such details as memory layout or hash function also matter, but I'm not going to go into these details here. If you're curious how V8 Maps work under the hood, you may find more details here. I was interested in the topic some time ago and tried to share my findings in a readable manner.

Solution 5 - Javascript

Why don't we just test?

var size_1 = 1000,
    size_2 = 1000000,
    map_sm = new Map(Array.from({length: size_1}, (_,i) => [++i,i])),
    map_lg = new Map(Array.from({length: size_2}, (_,i) => [++i,i])),
    i      = size_1,
    j      = size_2,
    s;

s = performance.now();
while (i) map_sm.get(i--);
console.log(`Size ${size_1} map returns an item in average ${(performance.now()-s)/size_1}ms`);
s = performance.now();
while (j) map_lg.get(j--);
console.log(`Size ${size_2} map returns an item in average ${(performance.now()-s)/size_2}ms`);

So to me it seems like it converges to O(1) as the size grows. Well, let's call it O(1) then.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
|---|---|---|
| Question | Uri | View Question on Stackoverflow |
| Solution 1 - Javascript | Bergi | View Answer on Stackoverflow |
| Solution 2 - Javascript | David Guan | View Answer on Stackoverflow |
| Solution 3 - Javascript | Abhay Kumar | View Answer on Stackoverflow |
| Solution 4 - Javascript | Andrey Pechkurov | View Answer on Stackoverflow |
| Solution 5 - Javascript | Redu | View Answer on Stackoverflow |