python.array versus numpy.array

PythonNumpy

Python Problem Overview


If you are creating a 1d array in Python is there any benefit to using the NumPy package?

Python Solutions


Solution 1 - Python

It all depends on what you plan to do with the array. If all you're doing is creating arrays of simple data types and doing I/O, the array module will do just fine.

If, on the other hand, you want to do any kind of numerical calculations, the array module doesn't provide any help with that. NumPy (and SciPy) give you a wide variety of operations between arrays and special functions that are useful not only for scientific work but for things like advanced image manipulation or in general anything where you need to perform efficient calculations with large amounts of data.

Numpy is also much more flexible, e.g. it supports arrays of any type of Python objects, and is also able to interact "natively" with your own objects if they conform to the array interface.

Solution 2 - Python

Small bootstrapping for the benefit of whoever might find this useful (following the excellent answer by @dF.):

import numpy as np
from array import array

# Fixed size numpy array
def np_fixed(n):
    q = np.empty(n)
    for i in range(n):
        q[i] = i
    return q
        
# Resize with np.resize
def np_class_resize(isize, n):
    q = np.empty(isize)
    for i in range(n):
        if i>=q.shape[0]:
            q = np.resize(q, q.shape[0]*2)        
        q[i] = i
    return q    
    
# Resize with the numpy.array method
def np_method_resize(isize, n):
    q = np.empty(isize)
    for i in range(n):
        if i>=q.shape[0]:
            q.resize(q.shape[0]*2)
        q[i] = i
    return q

# Array.array append
def arr(n):
    q = array('d')
    for i in range(n):
        q.append(i)
    return q

isize = 1000
n = 10000000

The output gives:

%timeit -r 10 a = np_fixed(n)
%timeit -r 10 a = np_class_resize(isize, n)
%timeit -r 10 a = np_method_resize(isize, n)
%timeit -r 10 a = arr(n)

1 loop, best of 10: 868 ms per loop
1 loop, best of 10: 2.03 s per loop
1 loop, best of 10: 2.02 s per loop
1 loop, best of 10: 1.89 s per loop

It seems that array.array is slightly faster and the 'api' saves you some hassle, but if you need more than just storing doubles then numpy.resize is not a bad choice after all (if used correctly).

Solution 3 - Python

For storage purposes, numpy array beats array.array. Here is the code for benchmark for both comparing storage size of unsigned integer of 4 bytes. Other datatypes can also be used for comparison. Data of list and tuple is also added for comparison. For numbers smaller than 10, list and tuples can be used.

a = [100400, 1004000, 10040, 100400, 100400, 100400, 400, 1006000, 8, 800999]
print("size per element for list, tuple, numpy array, array.array:===============")
for i in range(1, 15):
    aa = a*i#list size multiplier
    n = len(aa)
    list_size = getsizeof(aa)
    d = list_size
    tup_aa = tuple(aa)
    tup_size = getsizeof(tup_aa)
    nparr = np.array(aa, dtype='uint32')
    np_size = getsizeof(nparr)
    arr = array('L', aa)#4 byte unsigned integer
    arr_size = getsizeof(arr)
    print('number of element:%s, list %.2f, tuple %.2f, np.array %.2f, arr.array %.2f' % \
          (len(aa), list_size/n, tup_size/n, np_size/n, arr_size/n))

This was producing the following ouput in my machine:

size per element for list, tuple, numpy array, array.array:===============
number of element:10, list 14.40, tuple 12.80, np.array 13.60, arr.array 14.40
number of element:20, list 11.20, tuple 10.40, np.array 8.80, arr.array 11.20
number of element:30, list 10.13, tuple 9.60, np.array 7.20, arr.array 10.13
number of element:40, list 9.60, tuple 9.20, np.array 6.40, arr.array 9.60
number of element:50, list 9.28, tuple 8.96, np.array 5.92, arr.array 9.28
number of element:60, list 9.07, tuple 8.80, np.array 5.60, arr.array 9.07
number of element:70, list 8.91, tuple 8.69, np.array 5.37, arr.array 8.91
number of element:80, list 8.80, tuple 8.60, np.array 5.20, arr.array 8.80
number of element:90, list 8.71, tuple 8.53, np.array 5.07, arr.array 8.71
number of element:100, list 8.64, tuple 8.48, np.array 4.96, arr.array 8.64
number of element:110, list 8.58, tuple 8.44, np.array 4.87, arr.array 8.58
number of element:120, list 8.53, tuple 8.40, np.array 4.80, arr.array 8.53
number of element:130, list 8.49, tuple 8.37, np.array 4.74, arr.array 8.49
number of element:140, list 8.46, tuple 8.34, np.array 4.69, arr.array 8.46

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionHortitudeView Question on Stackoverflow
Solution 1 - PythondF.View Answer on Stackoverflow
Solution 2 - PythonnivnivView Answer on Stackoverflow
Solution 3 - PythonAlok NayakView Answer on Stackoverflow