How do I add the contents of an iterable to a set?

PythonSetConventionsIterable

Python Problem Overview


What is the "one [...] obvious way" to add all items of an iterable to an existing set?

Python Solutions


Solution 1 - Python

You can add elements of a list to a set like this:

>>> foo = set(range(0, 4))
>>> foo
set([0, 1, 2, 3])
>>> foo.update(range(2, 6))
>>> foo
set([0, 1, 2, 3, 4, 5])

Solution 2 - Python

For the benefit of anyone who might believe e.g. that doing aset.add() in a loop would have performance competitive with doing aset.update(), here's an example of how you can test your beliefs quickly before going public:

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 294 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 950 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 458 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 598 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 1.89 msec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 891 usec per loop

Looks like the cost per item of the loop approach is over THREE times that of the update approach.

Using |= set() costs about 1.5x what update does but half of what adding each individual item in a loop does.

Solution 3 - Python

You can use the set() function to convert an iterable into a set, and then use standard set update operator (|=) to add the unique values from your new set into the existing one.

>>> a = { 1, 2, 3 }
>>> b = ( 3, 4, 5 )
>>> a |= set(b)
>>> a
set([1, 2, 3, 4, 5])

Solution 4 - Python

Just a quick update, timings using python 3:

#!/usr/local/bin python3
from timeit import Timer

a = set(range(1, 100000))
b = list(range(50000, 150000))

def one_by_one(s, l):
    for i in l:
        s.add(i)    

def cast_to_list_and_back(s, l):
    s = set(list(s) + l)

def update_set(s,l):
    s.update(l)

results are:

one_by_one 10.184448844986036
cast_to_list_and_back 7.969255169969983
update_set 2.212590195937082

Solution 5 - Python

Use list comprehension.

Short circuiting the creation of iterable using a list for example :)

>>> x = [1, 2, 3, 4]
>>> 
>>> k = x.__iter__()
>>> k
<listiterator object at 0x100517490>
>>> l = [y for y in k]
>>> l
[1, 2, 3, 4]
>>> 
>>> z = Set([1,2])
>>> z.update(l)
>>> z
set([1, 2, 3, 4])
>>> 

[Edit: missed the set part of question]

Solution 6 - Python

for item in items:
   extant_set.add(item)

For the record, I think the assertion that "There should be one-- and preferably only one --obvious way to do it." is bogus. It makes an assumption that many technical minded people make, that everyone thinks alike. What is obvious to one person is not so obvious to another.

I would argue that my proposed solution is clearly readable, and does what you ask. I don't believe there are any performance hits involved with it--though I admit I might be missing something. But despite all of that, it might not be obvious and preferable to another developer.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionIan MackinnonView Question on Stackoverflow
Solution 1 - PythonSingleNegationEliminationView Answer on Stackoverflow
Solution 2 - PythonJohn MachinView Answer on Stackoverflow
Solution 3 - PythongbcView Answer on Stackoverflow
Solution 4 - PythonDaniel DubovskiView Answer on Stackoverflow
Solution 5 - PythonpyfuncView Answer on Stackoverflow
Solution 6 - PythonjaydelView Answer on Stackoverflow