How to join two generators in Python?
Python Problem Overview
I want to change the following code:
for directory, dirs, files in os.walk(directory_1):
    do_something()
for directory, dirs, files in os.walk(directory_2):
    do_something()
to this code:
for directory, dirs, files in os.walk(directory_1) + os.walk(directory_2):
    do_something()
I get the error:
> unsupported operand type(s) for +: 'generator' and 'generator'
How to join two generators in Python?
Python Solutions
Solution 1 - Python
itertools.chain() should do it.
It takes any number of iterables as arguments and yields the elements of each one in turn. It is roughly equivalent to:
def chain(*iterables):
    for it in iterables:
        for element in it:
            yield element
Usage example:
from itertools import chain
generator = chain('ABC', 'DEF')
for item in generator:
    print(item)
Output:
A
B
C
D
E
F
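Applied to the os.walk loops from the question, a minimal sketch might look like the following. The two temporary directory trees and the `visited` list are made up so the example can run on its own; only `os.walk` and `itertools.chain` come from the question and the answer.

```python
import os
import tempfile
from itertools import chain

visited = []

# Two throwaway directory trees, created only to make the example self-contained.
with tempfile.TemporaryDirectory() as directory_1, \
        tempfile.TemporaryDirectory() as directory_2:
    os.mkdir(os.path.join(directory_1, "sub"))

    # chain() exhausts the first walk, then continues with the second one.
    for directory, dirs, files in chain(os.walk(directory_1), os.walk(directory_2)):
        visited.append(directory)

print(visited)
```

Both walks stay lazy: each directory is only read when the loop reaches it.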
Solution 2 - Python
An example:
from itertools import chain
def generator1():
    for item in 'abcdef':
        yield item
def generator2():
    for item in '123456':
        yield item
generator3 = chain(generator1(), generator2())
for item in generator3:
    print(item)
Solution 3 - Python
In Python 3.3 or greater (where yield from was introduced) you can do:
def concat(a, b):
    yield from a
    yield from b
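A short usage sketch, assuming the `concat` function above. The `letters` generator is a made-up example; `itertools.count` and `itertools.islice` are used only to show that nothing is consumed ahead of time.

```python
from itertools import count, islice

def concat(a, b):
    yield from a
    yield from b

def letters():
    # Hypothetical sample generator.
    yield "a"
    yield "b"

# count(1) never ends, so this only terminates because concat() is lazy
# and islice() stops asking after five items.
joined = concat(letters(), count(1))
first_five = list(islice(joined, 5))
print(first_five)  # ['a', 'b', 1, 2, 3]
```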
Solution 4 - Python
Simple example:
from itertools import chain
x = iter([1, 2, 3])  # create an iterator object (a list_iterator, not a generator)
y = iter([3, 4, 5])  # another one
result = chain(x, y)  # lazily chains x and y
Solution 5 - Python
With itertools.chain.from_iterable you can do things like:
import itertools
def genny(start):
    for x in range(start, start+3):
        yield x
y = [1, 2]
ab = [o for o in itertools.chain.from_iterable(genny(x) for x in y)]
print(ab)  # [1, 2, 3, 2, 3, 4]
Solution 6 - Python
Here it is using a generator expression with nested fors:
a = range(3)
b = range(5)
ab = (i for it in (a, b) for i in it)
assert list(ab) == [0, 1, 2, 0, 1, 2, 3, 4]
Solution 7 - Python
One can also use the unpacking operator *:
concat = (*gen1(), *gen2())
NOTE: This consumes both generators immediately and builds a tuple, so it works most efficiently for 'non-lazy' iterables. It can also be used with different kinds of comprehensions. The preferred way to concatenate generators is the one from @Uduse's answer.
Solution 8 - Python
2020 update: Works in both Python 3 and Python 2
import itertools
iterA = range(10,15)
iterB = range(15,20)
iterC = range(20,25)
### first option
for i in itertools.chain(iterA, iterB, iterC):
    print(i)
# 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
### alternative option, introduced in python 2.6
for i in itertools.chain.from_iterable([iterA, iterB, iterC]):
    print(i)
# 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
itertools.chain() is the basic tool.
itertools.chain.from_iterable is handy if you have an iterable of iterables, for example a list of files per subdirectory like [ ["src/server.py", "src/readme.txt"], ["test/test.py"] ].
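As a small illustration of the iterable-of-iterables case, here is a sketch that flattens a list of files per subdirectory. The variable names are made up for the example.

```python
from itertools import chain

# One inner list of files per subdirectory.
files_per_directory = [
    ["src/server.py", "src/readme.txt"],
    ["test/test.py"],
]

# from_iterable() takes the outer iterable as a single argument,
# so there is no need to unpack it with chain(*files_per_directory).
all_files = list(chain.from_iterable(files_per_directory))
print(all_files)  # ['src/server.py', 'src/readme.txt', 'test/test.py']
```

Because the outer iterable is itself consumed lazily, this form also works when the outer iterable is a generator.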
Solution 9 - Python
If you want to keep the generators separate but still iterate over them at the same time, you can use zip():
NOTE: Iteration stops at the shorter of the two generators
For example:
for (root1, dir1, files1), (root2, dir2, files2) in zip(os.walk(path1), os.walk(path2)):
    for file in files1:
        # do something with first list of files
        pass
    for file in files2:
        # do something with second list of files
        pass
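If stopping at the shorter generator is not acceptable, one possible alternative (not part of the answer above) is `itertools.zip_longest`, which pads the exhausted side with a fill value. The two sample generators below are made up for the comparison.

```python
from itertools import zip_longest

def shorter_gen():
    # Hypothetical generator with two items.
    yield from [1, 2]

def longer_gen():
    # Hypothetical generator with three items.
    yield from ["a", "b", "c"]

# zip() drops the unmatched "c"; zip_longest() keeps it and pads with None.
pairs_zip = list(zip(shorter_gen(), longer_gen()))
pairs_longest = list(zip_longest(shorter_gen(), longer_gen(), fillvalue=None))

print(pairs_zip)      # [(1, 'a'), (2, 'b')]
print(pairs_longest)  # [(1, 'a'), (2, 'b'), (None, 'c')]
```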
Solution 10 - Python
Let's say that we have two generators (gen1 and gen2) and we want to perform some extra calculation that requires the outcome of both. We can return the outcome of such a calculation through the built-in map function, which (in Python 3) returns a lazy iterator that we can loop over.
In this scenario, the calculation is passed to map as a function, here a lambda. The tricky part is deciding what to do inside that function.
General form of proposed solution:
def function(gen1, gen2):
    for item in map(lambda x, y: do_something(x, y), gen1, gen2):
        yield item
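A runnable sketch of that general form, with addition standing in for the calculation. The `squares`, `cubes`, and `combine` names are hypothetical and chosen only for this example.

```python
def squares():
    for n in range(1, 4):
        yield n * n

def cubes():
    for n in range(1, 4):
        yield n ** 3

def combine(gen1, gen2):
    # map() pulls one item from each generator per step and
    # stops as soon as either generator is exhausted.
    yield from map(lambda x, y: x + y, gen1, gen2)

result = list(combine(squares(), cubes()))
print(result)  # [2, 12, 36]
```

Note that this pairs items up element by element; it does not concatenate the two generators.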
Solution 11 - Python
I would say that, as suggested in comments by user "wjandrea", the best solution is
def concat_generators(*args):
    for gen in args:
        yield from gen
It does not change the returned type and is really pythonic.
Solution 12 - Python
(Disclaimer: Python 3 only!)
Something with syntax similar to what you want is to use the splat operator to expand the two generators:
for directory, dirs, files in (*os.walk(directory_1), *os.walk(directory_2)):
    do_something()
Explanation:
This effectively performs a single-level flattening of the two generators into an N-tuple of 3-tuples (from os.walk) that looks like:
((directory1, dirs1, files1), (directory2, dirs2, files2), ...)
Your for-loop then iterates over this N-tuple.
Of course, by simply replacing the outer parentheses with brackets, you can get a list of 3-tuples instead of an N-tuple of 3-tuples:
for directory, dirs, files in [*os.walk(directory_1), *os.walk(directory_2)]:
    do_something()
This yields something like:
[(directory1, dirs1, files1), (directory2, dirs2, files2), ...]
Pro:
The upside to this approach is that you don't have to import anything and it's not a lot of code.
Con:
The downside is that you dump two generators into a collection and then iterate over that collection, effectively doing two passes and potentially using a lot of memory.
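The difference between eager unpacking and a lazy `itertools.chain` can be made visible with a generator that records when it runs. This is only a sketch; the `noisy` generator and the `log` list are made up for illustration.

```python
from itertools import chain

log = []

def noisy(name, n):
    # Hypothetical generator that records every item it produces.
    for i in range(n):
        log.append(f"{name} produced {i}")
        yield i

# Unpacking runs both generators to completion before any loop could start.
eager = (*noisy("g1", 2), *noisy("g2", 2))
produced_by_unpacking = len(log)  # 4

log.clear()

# chain() produces nothing until it is iterated.
lazy = chain(noisy("g1", 2), noisy("g2", 2))
produced_by_chain = len(log)      # 0
first = next(lazy)
produced_after_next = len(log)    # 1
```

For os.walk on large trees, this means the unpacking version reads every directory up front, while the chained version reads them as the loop advances.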
Solution 13 - Python
If you just need to do it once and do not wish to import one more module, there is a simple solution:
for top in directory_1, directory_2:
    for directory, dirs, files in os.walk(top):
        do_something()
If you really want to "join" both generators, then do:
for directory, dirs, files in (
    x for osw in [os.walk(directory_1), os.walk(directory_2)]
    for x in osw
):
    do_something()