join list of lists in python

Python Problem Overview


Is there a short syntax for joining a list of lists into a single list (or iterator) in Python?

For example, I have the following list and I want to iterate over a, b and c.

x = [["a","b"], ["c"]]

The best I can come up with is as follows.

result = []
[result.extend(el) for el in x]

for el in result:
    print(el)

Python Solutions


Solution 1 - Python

import itertools
a = [['a','b'], ['c']]
print(list(itertools.chain.from_iterable(a)))
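
Since the question also asks about iterating, here is a minimal sketch of using the same call without building a list first. The `collected` name is only for illustration; the point is that `chain.from_iterable` hands back a lazy iterator that can be looped over directly.

```python
from itertools import chain

x = [["a", "b"], ["c"]]

# chain.from_iterable returns a lazy iterator; no intermediate list is built.
collected = []
for el in chain.from_iterable(x):
    collected.append(el)

print(collected)  # ['a', 'b', 'c']
```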

Solution 2 - Python

x = [["a","b"], ["c"]]

result = sum(x, [])
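
To make it clearer what this one-liner does, the sketch below spells out the same computation by hand. The names `by_sum` and `by_hand` are illustrative only; `sum` starts from the second argument (the empty list) and applies `+` to each sublist in turn.

```python
x = [["a", "b"], ["c"]]

# sum(x, []) starts from [] and adds each sublist with +.
by_sum = sum(x, [])

# The same thing written out for this two-element input.
by_hand = [] + x[0] + x[1]

print(by_sum)  # ['a', 'b', 'c']
```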

Solution 3 - Python

If you're only going one level deep, a nested comprehension will also work:

>>> x = [["a","b"], ["c"]]
>>> [inner
...     for outer in x
...         for inner in outer]
['a', 'b', 'c']

On one line, that becomes:

>>> [j for i in x for j in i]
['a', 'b', 'c']
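
If the order of the `for` clauses looks backwards, it may help to see the comprehension as ordinary nested loops. This is just an illustrative expansion (the `flat` name is an assumption), showing that the clauses read left to right in the same order as the loops nest.

```python
x = [["a", "b"], ["c"]]

# Equivalent of [j for i in x for j in i], written as explicit loops.
flat = []
for i in x:          # outer loop: each sublist
    for j in i:      # inner loop: each item in that sublist
        flat.append(j)

print(flat)  # ['a', 'b', 'c']
```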

Solution 4 - Python

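l = []
# Python 2: map() is eager, so a bare map(...) call fills l.
# Python 3: map() is lazy; list(...) forces the extend calls to run.
list(map(l.extend, list_of_lists))

Shortest!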

Solution 5 - Python

This is known as flattening, and there are a LOT of implementations out there.

How about this? It only works for one level of nesting:

>>> x = [["a","b"], ["c"]]
>>> for el in sum(x, []):
...     print(el)
...
a
b
c

From those links, apparently the most complete-fast-elegant-etc implementation is the following:

def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)

Solution 6 - Python

If you need a list, not a generator, use list():

from itertools import chain
x = [["a","b"], ["c"]]
y = list(chain(*x))
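
One practical difference between `chain(*x)` and `chain.from_iterable(x)` is that the `*` form has to unpack the whole outer iterable before it starts. The sketch below uses a hypothetical endless generator, `sublists`, to illustrate a case where only `from_iterable` can work.

```python
from itertools import chain, count, islice

def sublists():
    """Hypothetical endless source of sublists: [0, 0], [1, 1], [2, 2], ..."""
    for n in count():
        yield [n, n]

# chain(*sublists()) would never return, because * must exhaust the generator first.
# chain.from_iterable pulls one sublist at a time, so it works here.
first_five = list(islice(chain.from_iterable(sublists()), 5))
print(first_five)  # [0, 0, 1, 1, 2]
```

For an ordinary in-memory list of lists, both spellings give the same result.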

Solution 7 - Python

A performance comparison:

import itertools
import timeit
big_list = [[0]*1000 for i in range(1000)]
timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]

Producing:

>>> import itertools
>>> import timeit
>>> big_list = [[0]*1000 for i in range(1000)]
>>> timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
[3.016212113769325, 3.0148865239060227, 3.0126415732791028]
>>> timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
[3.019953987082083, 3.528754223385439, 3.02181439266457]
>>> timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
[1.812084445152557, 1.7702404451095965, 1.7722977998725362]
>>> timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[5.409658160700605, 5.477502077679354, 5.444318360412744]
>>> [100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]
[399.27587954973444, 400.9240571138051, 403.7521153804846]

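This is with Python 2.7.1 on Windows XP 32-bit. A commenter (@temoto) found from_iterable to be faster than map+extend, so the results are quite platform and input dependent. Note that the map+extend timing relies on Python 2's eager map; in Python 3, map is lazy and that line would do no work unless the result is consumed.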

Stay away from sum(big_list, []): in the timings above it is roughly two orders of magnitude slower than the other approaches.
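
A rough cost model (not a measurement) suggests why `sum` falls so far behind. Assuming n sublists of equal length k, each `+` builds a brand new list, so step i copies about i * k elements. The helper names below are hypothetical and only do the arithmetic.

```python
def copies_with_sum(n_sublists, sublist_len):
    """Elements copied by repeated +: step i builds a new list of i * sublist_len items."""
    return sum(i * sublist_len for i in range(1, n_sublists + 1))

def copies_with_extend(n_sublists, sublist_len):
    """Elements copied when each element is appended to one growing list exactly once."""
    return n_sublists * sublist_len

# Same shape as big_list above: 1000 sublists of 1000 items each.
print(copies_with_sum(1000, 1000))     # 500500000
print(copies_with_extend(1000, 1000))  # 1000000
```

The model ignores allocation and interpreter overhead, so it only indicates the trend: quadratic growth for `sum`, linear for the others.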

Solution 8 - Python

This works recursively for arbitrarily deeply nested elements (within Python's recursion limit):

def iterFlatten(root):
    if isinstance(root, (list, tuple)):
        for element in root:
            for e in iterFlatten(element):
                yield e
    else:
        yield root

Result:

>>> b = [["a", ("b", "c")], "d"]
>>> list(iterFlatten(b))
['a', 'b', 'c', 'd']
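
On Python 3.3 and later, the inner loop can be replaced by `yield from`. The sketch below is the same idea under a different, hypothetical name (`iter_flatten`), not a change to the answer's function.

```python
def iter_flatten(root):
    """Yield the leaves of arbitrarily nested lists and tuples."""
    if isinstance(root, (list, tuple)):
        for element in root:
            # Delegate to the recursive generator instead of looping over it.
            yield from iter_flatten(element)
    else:
        yield root

b = [["a", ("b", "c")], "d"]
print(list(iter_flatten(b)))  # ['a', 'b', 'c', 'd']
```

Strings are yielded whole because only lists and tuples are treated as containers.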

Solution 9 - Python

Late to the party but ...

I'm new to Python and come from a Lisp background. This is what I came up with (check out the var names for lulz):

def flatten(lst):
    if not lst:
        return []  # empty input (or an empty nested list) contributes nothing
    car, *cdr = lst
    if isinstance(car, (list, tuple)):
        return flatten(car) + flatten(cdr)
    return [car] + flatten(cdr)

Seems to work. Test:

flatten((1,2,3,(4,5,6,(7,8,(((1,2)))))))

returns:

[1, 2, 3, 4, 5, 6, 7, 8, 1, 2]

Solution 10 - Python

What you're describing is known as flattening a list, and with this new knowledge you'll be able to find many solutions to this on Google (there is no built-in flatten method). Here is one of them, from <http://www.daniel-lemire.com/blog/archives/2006/05/10/flattening-lists-in-python/>:

def flatten(x):
    ans = []
    for i in x:
        if isinstance(i, list):
            # Recurse into nested lists, keeping what has been collected so far.
            ans.extend(flatten(i))
        else:
            ans.append(i)
    return ans

Solution 11 - Python

There's always reduce (a builtin in Python 2, moved to functools.reduce in Python 3):

>>> x = [ [ 'a', 'b'], ['c'] ]
>>> for el in reduce(lambda a,b: a+b, x, []):
...  print el
...
__main__:1: DeprecationWarning: reduce() not supported in 3.x; use functools.reduce()
a
b
c
>>> import functools
>>> for el in functools.reduce(lambda a,b: a+b, x, []):
...   print el
...
a
b
c
>>>

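Unfortunately the plus operator for list concatenation can't be passed directly as a function, so a lambda is needed here (the standard operator module does offer function equivalents) -- or fortunately, if you prefer lambdas to be ugly for improved visibility.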
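
For completeness, the standard library's `operator` module exposes `+` as a plain function, so the lambda is optional. A minimal sketch using `operator.concat` with `functools.reduce`:

```python
import functools
import operator

x = [["a", "b"], ["c"]]

# operator.concat(a, b) is a + b for sequences, so this matches the lambda version.
flat = functools.reduce(operator.concat, x, [])
print(flat)  # ['a', 'b', 'c']
```

Like the lambda version, this builds a new list at every step, so it is best kept for short inputs.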

Solution 12 - Python

Or a recursive operation:

def flatten(input):
    ret = []
    if not isinstance(input, (list, tuple)):
        return [input]
    for i in input:
        if isinstance(i, (list, tuple)):
            ret.extend(flatten(i))
        else:
            ret.append(i)
    return ret
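
Recursive versions like this one can run into Python's recursion limit on very deeply nested input. As a hedged alternative, the sketch below keeps an explicit stack of iterators instead of recursing; `flatten_iterative` is a hypothetical name, not part of the answer.

```python
def flatten_iterative(data):
    """Flatten nested lists/tuples without recursion, using a stack of iterators."""
    result = []
    stack = [iter([data])]
    while stack:
        for item in stack[-1]:
            if isinstance(item, (list, tuple)):
                # Descend into the nested container; the parent iterator keeps its place.
                stack.append(iter(item))
                break
            result.append(item)
        else:
            # The innermost iterator is exhausted, so go back up one level.
            stack.pop()
    return result

print(flatten_iterative([1, [2, (3, [4])], 5]))  # [1, 2, 3, 4, 5]
```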

Solution 13 - Python

For one-level flatten, if you care about speed, this is faster than any of the previous answers under all conditions I tried. (That is, if you need the result as a list. If you only need to iterate through it on the fly then the chain example is probably better.) It works by pre-allocating a list of the final size and copying the parts in by slice (which is a lower-level block copy than any of the iterator methods):

def join(a):
    """Joins a sequence of sequences into a single sequence.  (One-level flattening.)
    E.g., join([(1,2,3), [4, 5], [6, (7, 8, 9), 10]]) = [1,2,3,4,5,6,(7,8,9),10]
    This is very efficient, especially when the subsequences are long.
    """
    n = sum([len(b) for b in a])
    l = [None]*n
    i = 0
    for b in a:
        j = i+len(b)
        l[i:j] = b
        i = j
    return l

Sorted times list with comments:

[(0.5391559600830078, 'flatten4b'), # join() above. 
(0.5400412082672119, 'flatten4c'), # Same, with sum(len(b) for b in a) 
(0.5419249534606934, 'flatten4a'), # Similar, using zip() 
(0.7351131439208984, 'flatten1b'), # list(itertools.chain.from_iterable(a)) 
(0.7472689151763916, 'flatten1'), # list(itertools.chain(*a)) 
(1.5468521118164062, 'flatten3'), # [i for j in a for i in j] 
(26.696547985076904, 'flatten2')] # sum(a, [])

Solution 14 - Python

Sadly, Python doesn't have a simple way to flatten lists. Try this:

def flatten(some_list):
    for element in some_list:
        if type(element) in (tuple, list):
            for item in flatten(element):
                yield item
        else:
            yield element

This recursively flattens a list, so you can iterate over the result directly:

for el in flatten(x):
    print(el)

Solution 15 - Python

I had a similar problem when I had to create a dictionary that contained the elements of an array and their counts. The answer is relevant because I flatten a list of lists, get the elements I need, and then do a group and count. I used Python's map function to produce a tuple of each element and its count, and groupby over the array. Note that groupby takes the array element itself as the keyfunc. As a relatively new Python coder, I find it easier to comprehend, while being Pythonic as well.

Before I discuss the code, here is a sample of data I had to flatten first:

{ "_id" : ObjectId("4fe3a90783157d765d000011"), "status" : [ "opencalais" ],
  "content_length" : 688, "open_calais_extract" : { "entities" : [
  {"type" :"Person","name" : "Iman Samdura","rel_score" : 0.223 }, 
  {"type" : "Company", 	"name" : "Associated Press", 	"rel_score" : 0.321 },          
  {"type" : "Country", 	"name" : "Indonesia", 	"rel_score" : 0.321 }, ... ]},
  "title" : "Indonesia Police Arrest Bali Bomb Planner", "time" : "06:42  ET",         
  "filename" : "021121bn.01", "month" : "November", "utctime" : 1037836800,
  "date" : "November 21, 2002", "news_type" : "bn", "day" : "21" }

It is a query result from Mongo. The code below flattens the entity lists from a collection of such documents.

def flatten_list(items):
    # Collect the 'name' of every entity across all documents, then sort.
    return sorted(
        entity['name']
        for item in items
        for entity in item['open_calais_extract']['entities']
    )

First, I extract every "entities" collection; then, for each one, I iterate over its dictionaries and extract the name attribute.
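
The group-and-count step described above is not shown in the answer. A minimal sketch of how it might look with `itertools.groupby` follows; `count_names` is a hypothetical helper, and it assumes the names are already sorted (as `flatten_list` returns them), because `groupby` only merges adjacent equal items.

```python
from itertools import groupby

def count_names(sorted_names):
    """Return {name: count} for an already sorted list of names."""
    # With no key function, groupby groups on the element itself.
    return {name: len(list(group)) for name, group in groupby(sorted_names)}

names = sorted(["Indonesia", "Associated Press", "Indonesia"])
print(count_names(names))  # {'Associated Press': 1, 'Indonesia': 2}
```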

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author
Question | Kozyarchuk
Solution 1 - Python | CTT
Solution 2 - Python | Christian C. Salvadó
Solution 3 - Python | George V. Reilly
Solution 4 - Python | nate c
Solution 5 - Python | Paolo Bergantino
Solution 6 - Python | culebrón
Solution 7 - Python | Evgeni Sergeev
Solution 8 - Python | Georg Schölly
Solution 9 - Python | Michael Puckett
Solution 10 - Python | Paige Ruten
Solution 11 - Python | Aaron
Solution 12 - Python | svhb
Solution 13 - Python | Brandyn
Solution 14 - Python | Don Werve
Solution 15 - Python | Karthik Gomadam