Why are NumPy arrays so fast?

Python, Arrays, NumPy

Python Problem Overview


I just changed a program I am writing to hold my data as numpy arrays as I was having performance issues, and the difference was incredible. It originally took 30 minutes to run and now takes 2.5 seconds!

I was wondering how it does this. I assume it is because it removes the need for for loops, but beyond that I am stumped.

Python Solutions


Solution 1 - Python

Numpy arrays are densely packed arrays of homogeneous type. Python lists, by contrast, are arrays of pointers to objects, even when all of them are of the same type. So, you get the benefits of locality of reference.

Also, many Numpy operations are implemented in C, avoiding the general cost of loops in Python, pointer indirection and per-element dynamic type checking. The speed boost depends on which operations you're performing, but a few orders of magnitude isn't uncommon in number crunching programs.
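As a rough illustration of that packing (a minimal sketch; the exact byte counts depend on your Python build), you can compare the memory footprint of a list of floats with the equivalent NumPy array:

import sys
import numpy as np

n = 1000000
lst = [float(i) for i in range(n)]    # list of pointers to separate float objects
arr = np.arange(n, dtype=np.float64)  # one contiguous block of 8-byte doubles

# the list's pointer array plus the float objects it points to
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)
# the array's packed data buffer
array_bytes = arr.nbytes

print("list:  ~" + str(list_bytes) + " bytes")
print("array: ~" + str(array_bytes) + " bytes")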

Solution 2 - Python

numpy arrays are specialized data structures. This means you not only get the benefit of an efficient in-memory representation, but of efficient specialized implementations as well.

E.g. if you are summing two arrays, the addition is performed with specialized CPU vector instructions, instead of calling the Python implementation of int addition in a loop.
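As a small sketch of that difference (timings will vary from machine to machine), compare the vectorised + on two arrays with a Python-level loop over the same data:

import time
import numpy as np

n = 1000000
a = np.random.rand(n)
b = np.random.rand(n)
la, lb = list(a), list(b)

tic = time.time()
c = a + b                             # one ufunc call, the loop runs in C
toc = time.time()
print("array + :  " + str(1000*(toc-tic)) + "ms")

tic = time.time()
cl = [x + y for x, y in zip(la, lb)]  # one Python-level addition per element
toc = time.time()
print("list loop: " + str(1000*(toc-tic)) + "ms")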

Solution 3 - Python

Consider the following code:

import numpy as np
import time

a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time()
c = np.dot(a, b)  # vectorised dot product, computed in C
toc = time.time()

print("Vectorised version: " + str(1000*(toc-tic)) + "ms")

c = 0
tic = time.time()
for i in range(1000000):  # explicit Python-level loop over the elements
    c += a[i] * b[i]
toc = time.time()

print("For loop: " + str(1000*(toc-tic)) + "ms")

Output:

Vectorised version: 2.011537551879883ms
For loop: 539.8685932159424ms

Here NumPy is much faster because it takes advantage of data-level parallelism via SIMD (Single Instruction, Multiple Data) instructions, which a plain Python for loop cannot use.

Solution 4 - Python

NumPy arrays are extremely similar to 'normal' arrays such as those in C. Note that every element has to be of the same type. The speedup is great because you can take advantage of prefetching, and you can access any element in the array instantly by its index, since its address is just the base address plus index times element size.
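A short sketch of what that homogeneity buys you (the attributes below are standard NumPy): because every element shares one dtype, indexing is simple pointer arithmetic rather than pointer chasing:

import numpy as np

arr = np.array([1, 2, 3, 4], dtype=np.int64)

print(arr.dtype)     # int64 -- one type shared by every element
print(arr.itemsize)  # 8 bytes per element
print(arr.strides)   # (8,) -- bytes to step to reach the next element

arr[0] = 3.9         # values of another type are coerced to the array's dtype
print(arr[0])        # 3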

Solution 5 - Python

You still have for loops, but they run in C. For linear algebra, NumPy delegates to an optimized backend such as ATLAS, a library of tuned linear algebra routines.

http://math-atlas.sourceforge.net/

ATLAS (Automatically Tuned Linear Algebra Software) tunes itself when it is built, benchmarking several implementations to find the fastest one for the machine it is installed on. With some NumPy builds, computations may also be parallelized across multiple CPUs. So you end up with highly optimized C running over contiguous memory blocks.
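If you are curious which BLAS/LAPACK backend your own NumPy build uses (it might be ATLAS, OpenBLAS, or MKL depending on how it was installed), NumPy can print its build configuration:

import numpy as np

# reports the BLAS/LAPACK libraries this build was compiled against;
# the output differs from build to build
np.show_config()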

Solution 6 - Python

NumPy arrays are stored in memory as contiguous blocks, whereas a Python list stores pointers to objects that may be scattered all over memory. Memory access is therefore easy and fast for a NumPy array, and comparatively difficult and slow for a Python list.

source: https://algorithmdotcpp.blogspot.com/2022/01/prove-numpy-is-faster-than-normal-list.html
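A quick way to see that contiguity for yourself (these are standard NumPy flags):

import numpy as np

arr = np.arange(10, dtype=np.float64)
print(arr.flags['C_CONTIGUOUS'])   # True -- all ten doubles sit in one contiguous block

# a strided slice is a view over the same buffer, but no longer contiguous
view = arr[::2]
print(view.flags['C_CONTIGUOUS'])  # False -- elements are now 16 bytes apart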

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         | Original Author  | Original Content on Stackoverflow
Question             | Anake            | View Question on Stackoverflow
Solution 1 - Python  | Fred Foo         | View Answer on Stackoverflow
Solution 2 - Python  | riffraff         | View Answer on Stackoverflow
Solution 3 - Python  | VinKrish         | View Answer on Stackoverflow
Solution 4 - Python  | ScarletAmaranth  | View Answer on Stackoverflow
Solution 5 - Python  | Simon Bergot     | View Answer on Stackoverflow
Solution 6 - Python  | Deep Gajjar      | View Answer on Stackoverflow