How do I remove all zero elements from a NumPy array?

Python, Arrays, NumPy, Filtering

Python Problem Overview


I have a rank-1 numpy.array of which I want to make a boxplot. However, I want to exclude all values equal to zero in the array. Currently, I solve this by looping over the array and copying each value to a new array if it is not equal to zero. However, as the array consists of 86,000,000 values and I have to do this multiple times, this takes a lot of patience.

Is there a more intelligent way to do this?

Python Solutions


Solution 1 - Python

For a NumPy array a, you can use

a[a != 0]

to extract the values not equal to zero.
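As a minimal sketch of what happens here (the sample values are made up for illustration): a != 0 builds a boolean mask with the same shape as a, and indexing with that mask returns a new 1-D array holding only the elements where the mask is True.

```python
import numpy as np

# Made-up sample data, for illustration only
a = np.array([0.0, 1.5, 0.0, -2.0, 3.25, 0.0])

mask = a != 0        # boolean array, True where the element is non-zero
nonzero = a[mask]    # new array containing only the selected elements

print(mask)
print(nonzero)
```

Boolean indexing returns a copy, so the original array a is left unchanged.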

Solution 2 - Python

This is a case where you want to use masked arrays: they keep the shape of your array and are recognized by many NumPy and matplotlib functions.

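import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(1000, 5)  # dimensions must be integers (1e3 is a float)
X[np.abs(X) < .1] = 0  # some zeros
X = np.ma.masked_equal(X, 0)
plt.boxplot(X)  # intended: masked values are not plotted (may depend on the matplotlib version)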

#other functionalities of masked arrays
X.compressed() # get normal array with masked values removed
X.mask # get a boolean array of the mask
X.mean() # it automatically discards masked values
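One detail worth knowing: on a 2-D masked array, compressed() returns a flattened 1-D array, so the column structure is lost. If you need the unmasked values per column (for example, to hand a list of arrays to a plotting function), something like the following sketch may help. The sample data and the names Xm, flat, columns and col_means are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Made-up sample data, for illustration only
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 4.0]])
Xm = np.ma.masked_equal(X, 0)   # mask every entry equal to zero

flat = Xm.compressed()          # 1-D array of all unmasked values

# Unmasked values of each column, kept separate (columns may differ in length)
columns = [Xm[:, j].compressed() for j in range(Xm.shape[1])]

col_means = Xm.mean(axis=0)     # per-column mean, ignoring masked entries

print(flat)
print(columns)
print(col_means)
```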

Solution 3 - Python

I decided to compare the runtime of the different approaches mentioned here. I've used my library simple_benchmark for this.

The boolean indexing with array[array != 0] seems to be the fastest (and shortest) solution.

For smaller arrays the MaskedArray approach is very slow compared to the other approaches; for larger arrays, however, it is as fast as the boolean indexing approach. For moderately sized arrays there is not much difference between them.

Here is the code I've used:

from simple_benchmark import BenchmarkBuilder

import numpy as np

bench = BenchmarkBuilder()

@bench.add_function()
def boolean_indexing(arr):
    return arr[arr != 0]

@bench.add_function()
def integer_indexing_nonzero(arr):
    return arr[np.nonzero(arr)]

@bench.add_function()
def integer_indexing_where(arr):
    return arr[np.where(arr != 0)]

@bench.add_function()
def masked_array(arr):
    return np.ma.masked_equal(arr, 0)

@bench.add_arguments('array size')
def argument_provider():
    for exp in range(3, 25):
        size = 2**exp
        arr = np.random.random(size)
        arr[arr < 0.1] = 0  # add some zeros
        yield size, arr
        
r = bench.run()
r.plot()

Solution 4 - Python

You can index with a Boolean array. For a NumPy array A:

res = A[A != 0]

You can use Boolean array indexing as above, bool type conversion, np.nonzero, or np.where. Here's some performance benchmarking:

# Python 3.7, NumPy 1.14.3 (%timeit is an IPython magic)

import numpy as np

np.random.seed(0)

A = np.random.randint(0, 5, 10**8)

%timeit A[A != 0]          # 768 ms
%timeit A[A.astype(bool)]  # 781 ms
%timeit A[np.nonzero(A)]   # 1.49 s
%timeit A[np.where(A)]     # 1.58 s
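Outside of IPython, where %timeit is not available, a comparable check can be sketched with the standard-library timeit module. The array below is much smaller than the one in the benchmark above so that the script runs quickly, and timings vary by machine and NumPy version, so no numbers are quoted here. The sketch also checks that all four expressions select the same elements. The names candidates and reference are illustrative.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 5, 10**5)   # integers in [0, 5), so roughly 20% zeros

# The four approaches from the benchmark above, wrapped as callables
candidates = {
    "A[A != 0]": lambda: A[A != 0],
    "A[A.astype(bool)]": lambda: A[A.astype(bool)],
    "A[np.nonzero(A)]": lambda: A[np.nonzero(A)],
    "A[np.where(A)]": lambda: A[np.where(A)],
}

reference = A[A != 0]
for label, func in candidates.items():
    # Every approach should return exactly the same values
    assert np.array_equal(func(), reference)
    seconds = timeit.timeit(func, number=20) / 20
    print(f"{label:<20} {seconds * 1e3:.3f} ms per call")
```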

Solution 5 - Python

I suggest simply using NaN in cases like this, where you would like to ignore some values but still keep the procedure as statistically meaningful as possible:

In []: from pylab import *  # assumed: pylab-style names (randn, nan, isnan, boxplot)
In []: X = randn(1000, 5)
In []: X[abs(X) < .1] = nan
In []: isnan(X).sum(0)
Out[]: array([82, 84, 71, 81, 73])
In []: boxplot(X)
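If you go the NaN route, keep in mind that plain reductions such as mean() return NaN as soon as a single NaN is present, while NumPy's NaN-aware variants (np.nanmean, np.nanmedian, np.nanpercentile) skip those entries. A minimal sketch with made-up data:

```python
import numpy as np

# Made-up sample data, for illustration only
x = np.array([0.0, 2.0, 0.0, 4.0, 6.0])
x[x == 0] = np.nan             # mark the zeros as missing

plain_mean = x.mean()          # NaN propagates through the plain mean
nan_mean = np.nanmean(x)       # mean of the remaining values 2, 4, 6
nan_median = np.nanmedian(x)   # median of the remaining values
values = x[~np.isnan(x)]       # drop the NaN entries entirely

print(plain_mean, nan_mean, nan_median)
print(values)
```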

Solution 6 - Python

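np.argwhere gives you the indices of all non-zero elements (not the values themselves); indexing the array with them excludes all '0' values:

array[np.argwhere(array).ravel()]

example:

import numpy as np
array = np.array([0, 1, 0, 3, 4, 5, 0])
indices = np.argwhere(array).ravel()  # positions of the non-zero entries
array2 = array[indices]               # here the positions happen to equal the values
print(array2)

[1 3 4 5]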
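A small sketch with made-up data in which positions and values differ shows what np.argwhere returns (positions, one row per non-zero element) and how np.flatnonzero can be used to get the values instead. The names arr, positions, flat_positions and values are illustrative.

```python
import numpy as np

# Made-up sample data, for illustration only
arr = np.array([0, 7, 0, 9, 0])

positions = np.argwhere(arr)          # 2-D array of indices, shape (2, 1)
flat_positions = np.flatnonzero(arr)  # 1-D array of the same indices
values = arr[flat_positions]          # the non-zero values themselves

print(positions)
print(flat_positions)
print(values)
```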

Solution 7 - Python

Use [i for i in Array if i != 0.0] if the numbers are floats, or [i for i in Array if i != 0] if the numbers are ints.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | ruben baetens | View Question on Stack Overflow
Solution 1 - Python | Sven Marnach | View Answer on Stack Overflow
Solution 2 - Python | Andrea Zonca | View Answer on Stack Overflow
Solution 3 - Python | MSeifert | View Answer on Stack Overflow
Solution 4 - Python | jpp | View Answer on Stack Overflow
Solution 5 - Python | eat | View Answer on Stack Overflow
Solution 6 - Python | David Guest | View Answer on Stack Overflow
Solution 7 - Python | Shrm | View Answer on Stack Overflow