Initialise numpy array of unknown length

Python, Arrays, Numpy

Python Problem Overview


I want to be able to 'build' a numpy array on the fly, because I do not know the size of this array in advance.

For example I want to do something like this:

a = np.array()
for x in y:
    a.append(x)

This would result in a containing all the elements of y. Obviously this is a trivial example; I am just curious whether this is possible.

Python Solutions


Solution 1 - Python

Build a Python list and convert that to a NumPy array. That takes amortized O(1) time per append plus O(n) for the conversion to an array, for a total of O(n).

    a = []
    for x in y:
        a.append(x)
    a = np.array(a)
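
A minimal, self-contained sketch of the same pattern, assuming y is an iterable of numbers whose length is not known up front. The helper name collect and the sample generator are made up for illustration and are not part of the original answer:

```python
import numpy as np

def collect(y):
    """Gather the items of an iterable of unknown length into a 1-D array."""
    items = []              # Python list: cheap amortized appends
    for x in y:
        items.append(x)
    return np.array(items)  # single O(n) conversion at the end

# Hypothetical input: a generator, so its length cannot be queried in advance.
y = (i * i for i in range(5))
a = collect(y)
print(a.tolist())  # [0, 1, 4, 9, 16]
```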

Solution 2 - Python

You can do this:

a = np.array([])
for x in y:
    a = np.append(a, x)
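
One thing to keep in mind with this approach: np.append does not grow the array in place. It returns a new array containing a copy of the old data plus the new value, which is why the result has to be assigned back to a, and why each iteration copies everything gathered so far. A small sketch of that behaviour (the variable b exists only for illustration):

```python
import numpy as np

a = np.array([])
b = np.append(a, 1.0)  # returns a new array; `a` itself is left untouched

print(a.size)  # 0
print(b.size)  # 1
print(b is a)  # False
```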

Solution 3 - Python

Since y is an iterable, I really do not see the need for the calls to append. This:

a = np.array(list(y))

will do, and it's much faster:

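import timeit

# Build the list in one call.
print(timeit.timeit('list(s)', 's=set(x for x in range(1000))'))
# 23.952975494633154 (original author's timing, Python 2)

# Build the list with repeated append calls.
print(timeit.timeit("""li=[]
for x in s: li.append(x)""", 's=set(x for x in range(1000))'))
# 189.3826994248866 (original author's timing, Python 2)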

Solution 4 - Python

For posterity, I think this is quicker:

a = np.array([np.array(list()) for _ in y])

You might even be able to pass in a generator (i.e. [] -> ()), in which case the inner list is never fully stored in memory.
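
For reference, passing a bare generator straight to np.array does not iterate over it: the result is a 0-dimensional object array that wraps the generator. If the aim is to avoid the intermediate list, np.fromiter is the NumPy function meant for that case, provided a dtype is given. A minimal sketch, with a made-up generator standing in for y:

```python
import numpy as np

# np.array does not consume a generator; it wraps it in a 0-d object array.
wrapped = np.array(x for x in range(3))
print(wrapped.shape)  # ()

# Hypothetical generator standing in for y.
y = (x * 0.5 for x in range(4))

# np.fromiter reads the iterable item by item; the dtype argument is required.
a = np.fromiter(y, dtype=float)
print(a.tolist())  # [0.0, 0.5, 1.0, 1.5]
```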


Responding to comment below:

>>> import numpy as np
>>> y = range(10)
>>> a = np.array([np.array(list) for _ in y])
>>> a
array([array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object),
       array(<type 'list'>, dtype=object)], dtype=object)

Solution 5 - Python

a = np.empty(0)
for x in y:
    a = np.append(a, x)

Solution 6 - Python

I wrote a small utility function. Most of the answers above are good, but I feel this looks nicer.

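import numpy as np

def np_unknown_cat(acc, arr):
    # Add a leading axis so arr can be joined along axis 0.
    arrE = np.expand_dims(arr, axis=0)
    if acc is None:
        # First call: nothing has been accumulated yet.
        return arrE
    else:
        return np.concatenate((acc, arrE))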

You can use the function above as follows:

acc = None  # accumulator
arr1 = np.ones((3,4))
acc = np_unknown_cat(acc, arr1)  # acc.shape is now (1, 3, 4)
arr2 = np.ones((3,4))
acc = np_unknown_cat(acc, arr2)  # acc.shape is now (2, 3, 4)
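
As a point of comparison, here is a sketch that runs the helper in a loop and checks it against a different technique: collecting the arrays in a Python list and calling np.stack once at the end. The np.stack variant is not part of the original answer, and the chunks list is made up for illustration:

```python
import numpy as np

def np_unknown_cat(acc, arr):
    # Same helper as above, repeated so this snippet runs on its own.
    arrE = np.expand_dims(arr, axis=0)
    if acc is None:
        return arrE
    return np.concatenate((acc, arrE))

# Hypothetical input: equally shaped arrays arriving one at a time.
chunks = [np.ones((3, 4)), np.ones((3, 4))]

# Incremental accumulation with the helper.
acc = None
for arr in chunks:
    acc = np_unknown_cat(acc, arr)

# Collect-then-stack alternative: a single copy at the end.
stacked = np.stack(chunks)

print(acc.shape)      # (2, 3, 4)
print(stacked.shape)  # (2, 3, 4)
```

Both routes give the same result here. The helper copies the accumulated data on every call, while the list-based variant copies once.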

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         Original Author  Original Content on Stackoverflow
Question             user1220022      View Question on Stackoverflow
Solution 1 - Python  Fred Foo         View Answer on Stackoverflow
Solution 2 - Python  alexisdm         View Answer on Stackoverflow
Solution 3 - Python  Mr_and_Mrs_D     View Answer on Stackoverflow
Solution 4 - Python  BenDundee        View Answer on Stackoverflow
Solution 5 - Python  chiefenne        View Answer on Stackoverflow
Solution 6 - Python  Shubham Agrawal  View Answer on Stackoverflow