Understanding the set() function

PythonSet

Python Problem Overview


In python, set() is an unordered collection with no duplicate elements. However, I am not able to understand how it generates the output.

For example, consider the following:

>>> x = [1, 1, 2, 2, 2, 2, 2, 3, 3]
>>> set(x)
set([1, 2, 3])

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> set(y)
set([8, 1, 6])

>>> z = [1, 1, 6, 6, 6, 6, 6, 7, 7]
>>> set(z)
set([1, 6, 7])

Shouldn't the output of set(y) be: set([1, 6, 8])? I tried the above two in Python 2.6.

Python Solutions


Solution 1 - Python

Sets are unordered, as you say. Even though one way to implement sets is using a tree, they can also be implemented using a hash table (meaning getting the keys in sorted order may not be that trivial).

If you'd like to sort them, you can simply perform:

sorted(set(y))

which will produce a sorted list containing the set's elements. (Not a set. Again, sets are unordered.)

Otherwise, the only thing guaranteed by set is that it makes the elements unique (nothing will be there more than once).

Hope this helps!

Solution 2 - Python

As an unordered collection type, set([8, 1, 6]) is equivalent to set([1, 6, 8]).

While it might be nicer to display the set contents in sorted order, that would make the repr() call more expensive.

Internally, the set type is implemented using a hash table: a hash function is used to separate items into a number of buckets to reduce the number of equality operations needed to check if an item is part of the set.

To produce the repr() output it just outputs the items from each bucket in turn, which is unlikely to be the sorted order.

Solution 3 - Python

As +Volatility and yourself pointed out, sets are unordered. If you need the elements to be in order, just call sorted on the set:

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> sorted(set(y))
[1, 6, 8]

Solution 4 - Python

Python's sets (and dictionaries) will iterate and print out in some order, but exactly what that order will be is arbitrary, and not guaranteed to remain the same after additions and removals.

Here's an example of a set changing order after a lot of values are added and then removed:

>>> s = set([1,6,8])
>>> print(s)
{8, 1, 6}
>>> s.update(range(10,100000))
>>> for v in range(10, 100000):
    s.remove(v)
>>> print(s)
{1, 6, 8}

This is implementation dependent though, and so you should not rely upon it.

Solution 5 - Python

After reading the other answers, I still had trouble understanding why the set comes out un-ordered.

Mentioned this to my partner and he came up with this metaphor: take marbles. You put them in a tube a tad wider than marble width : you have a list. A set, however, is a bag. Even though you feed the marbles one-by-one into the bag; when you pour them from a bag back into the tube, they will not be in the same order (because they got all mixed up in a bag).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionPrakhar MehrotraView Question on Stackoverflow
Solution 1 - PythonuserView Answer on Stackoverflow
Solution 2 - PythonJames HenstridgeView Answer on Stackoverflow
Solution 3 - PythonI82MuchView Answer on Stackoverflow
Solution 4 - PythonBlckknghtView Answer on Stackoverflow
Solution 5 - PythonAniaView Answer on Stackoverflow