Time complexity of python set operations?

PythonData StructuresSetComplexity TheoryBig O

Python Problem Overview


What is the the time complexity of each of python's set operations in Big O notation?

I am using Python's set type for an operation on a large number of items. I want to know how each operation's performance will be affected by the size of the set. For example, add, and the test for membership:

myset = set()
myset.add('foo')
'foo' in myset

Googling around hasn't turned up any resources, but it seems reasonable that the time complexity for Python's set implementation would have been carefully considered.

If it exists, a link to something like this would be great. If nothing like this is out there, then perhaps we can work it out?

Extra marks for finding the time complexity of all set operations.

Python Solutions


Solution 1 - Python

According to Python wiki: Time complexity, set is implemented as a hash table. So you can expect to lookup/insert/delete in O(1) average. Unless your hash table's load factor is too high, then you face collisions and O(n).

P.S. for some reason they claim O(n) for delete operation which looks like a mistype.

P.P.S. This is true for CPython, pypy is a different story.

Solution 2 - Python

The operation in should be independent from he size of the container, ie. O(1) -- given an optimal hash function. This should be nearly true for Python strings. Hashing strings is always critical, Python should be clever there and thus you can expect near-optimal results.

Solution 3 - Python

The other answers do not talk about 2 crucial operations on sets: Unions and intersections. In the worst case, union will take O(n+m) whereas intersection will take O(min(x,y)) provided that there are not many element in the sets with the same hash. A list of time complexities of common operations can be found here: https://wiki.python.org/moin/TimeComplexity

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionStephen EmslieView Question on Stackoverflow
Solution 1 - PythonSergey RomanovskyView Answer on Stackoverflow
Solution 2 - PythontowiView Answer on Stackoverflow
Solution 3 - PythonFırat KıyakView Answer on Stackoverflow