Numpy: find index of the elements within range

PythonNumpy

Python Problem Overview


I have a numpy array of numbers, for example,

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])  

I would like to find all the indexes of the elements within a specific range. For instance, if the range is (6, 10), the answer should be (3, 4, 5). Is there a built-in function to do this?

Python Solutions


Solution 1 - Python

You can use np.where to get indices and np.logical_and to set two conditions:

import numpy as np
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

np.where(np.logical_and(a>=6, a<=10))
# returns (array([3, 4, 5]),)

Solution 2 - Python

As in @deinonychusaur's reply, but even more compact:

In [7]: np.where((a >= 6) & (a <=10))
Out[7]: (array([3, 4, 5]),)

Solution 3 - Python

Summary of the answers

For understanding what is the best answer we can do some timing using the different solution. Unfortunately, the question was not well-posed so there are answers to different questions, here I try to point the answer to the same question. Given the array:

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

The answer should be the indexes of the elements between a certain range, we assume inclusive, in this case, 6 and 10.

answer = (3, 4, 5)

Corresponding to the values 6,9,10.

To test the best answer we can use this code.

import timeit
setup = """
import numpy as np
import numexpr as ne

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
# or test it with an array of the similar size
# a = np.random.rand(100)*23 # change the number to the an estimate of your array size.

# we define the left and right limit
ll = 6
rl = 10

def sorted_slice(a,l,r):
    start = np.searchsorted(a, l, 'left')
    end = np.searchsorted(a, r, 'right')
    return np.arange(start,end)
"""

functions = ['sorted_slice(a,ll,rl)', # works only for sorted values
'np.where(np.logical_and(a>=ll, a<=rl))[0]',
'np.where((a >= ll) & (a <=rl))[0]',
'np.where((a>=ll)*(a<=rl))[0]',
'np.where(np.vectorize(lambda x: ll <= x <= rl)(a))[0]',
'np.argwhere((a>=ll) & (a<=rl)).T[0]', # we traspose for getting a single row
'np.where(ne.evaluate("(ll <= a) & (a <= rl)"))[0]',]

functions2 = [
   'a[np.logical_and(a>=ll, a<=rl)]',
   'a[(a>=ll) & (a<=rl)]',
   'a[(a>=ll)*(a<=rl)]',
   'a[np.vectorize(lambda x: ll <= x <= rl)(a)]',
   'a[ne.evaluate("(ll <= a) & (a <= rl)")]',
]

rdict = {}
for i in functions:
    rdict[i] = timeit.timeit(i,setup=setup,number=1000)
    print("%s -> %s s" %(i,rdict[i]))

print("Sorted:")
for w in sorted(rdict, key=rdict.get):
    print(w, rdict[w])

Results

The results are reported in the following plot for a small array (on the top the fastest solution) as noted by @EZLearner they may vary depending on the size of the array. sorted slice could be faster for larger arrays, but it requires your array to be sorted, for arrays with over 10 M of entries ne.evaluate could be an option. Is hence always better to perform this test with an array of the same size as yours: enter image description here

If instead of the indexes you want to extract the values you can perform the tests using functions2 but the results are almost the same.

Solution 4 - Python

I thought I would add this because the a in the example you gave is sorted:

import numpy as np
a = [1, 3, 5, 6, 9, 10, 14, 15, 56] 
start = np.searchsorted(a, 6, 'left')
end = np.searchsorted(a, 10, 'right')
rng = np.arange(start, end)
rng
# array([3, 4, 5])

Solution 5 - Python

a = np.array([1,2,3,4,5,6,7,8,9])
b = a[(a>2) & (a<8)]

Solution 6 - Python

This code snippet returns all the numbers in a numpy array between two values:

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56] )
a[(a>6)*(a<10)]

It works as following: (a>6) returns a numpy array with True (1) and False (0), so does (a<10). By multiplying these two together you get an array with either a True, if both statements are True (because 1x1 = 1) or False (because 0x0 = 0 and 1x0 = 0).

The part a[...] returns all values of array a where the array between brackets returns a True statement.

Of course you can make this more complicated by saying for instance

...*(1-a<10) 

which is similar to an "and Not" statement.

Solution 7 - Python

Other way is with:

np.vectorize(lambda x: 6 <= x <= 10)(a)

which returns:

array([False, False, False,  True,  True,  True, False, False, False])

It is sometimes useful for masking time series, vectors, etc.

Solution 8 - Python

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.argwhere((a>=6) & (a<=10))

Solution 9 - Python

Wanted to add numexpr into the mix:

import numpy as np
import numexpr as ne

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])  

np.where(ne.evaluate("(6 <= a) & (a <= 10)"))[0]
# array([3, 4, 5], dtype=int64)

Would only make sense for larger arrays with millions... or if you hitting a memory limits.

Solution 10 - Python

This may not be the prettiest, but works for any dimension

a = np.array([[-1,2], [1,5], [6,7], [5,2], [3,4], [0, 0], [-1,-1]])
ranges = (0,4), (0,4) 

def conditionRange(X : np.ndarray, ranges : list) -> np.ndarray:
    idx = set()
    for column, r in enumerate(ranges):
        tmp = np.where(np.logical_and(X[:, column] >= r[0], X[:, column] <= r[1]))[0]
        if idx:
            idx = idx & set(tmp)
        else:
            idx = set(tmp)
    idx = np.array(list(idx))
    return X[idx, :]

b = conditionRange(a, ranges)
print(b)

Solution 11 - Python

s=[52, 33, 70, 39, 57, 59, 7, 2, 46, 69, 11, 74, 58, 60, 63, 43, 75, 92, 65, 19, 1, 79, 22, 38, 26, 3, 66, 88, 9, 15, 28, 44, 67, 87, 21, 49, 85, 32, 89, 77, 47, 93, 35, 12, 73, 76, 50, 45, 5, 29, 97, 94, 95, 56, 48, 71, 54, 55, 51, 23, 84, 80, 62, 30, 13, 34]

dic={}

for i in range(0,len(s),10):
	dic[i,i+10]=list(filter(lambda x:((x>=i)&(x<i+10)),s))
print(dic)

for keys,values in dic.items():
    print(keys)
    print(values)

Output:

(0, 10)
[7, 2, 1, 3, 9, 5]
(20, 30)
[22, 26, 28, 21, 29, 23]
(30, 40)
[33, 39, 38, 32, 35, 30, 34]
(10, 20)
[11, 19, 15, 12, 13]
(40, 50)
[46, 43, 44, 49, 47, 45, 48]
(60, 70)
[69, 60, 63, 65, 66, 67, 62]
(50, 60)
[52, 57, 59, 58, 50, 56, 54, 55, 51]  

Solution 12 - Python

You can use np.clip() to achieve the same:

a = [1, 3, 5, 6, 9, 10, 14, 15, 56]  
np.clip(a,6,10)

However, it holds the values less than and greater than 6 and 10 respectively.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBobView Question on Stackoverflow
Solution 1 - PythondeinonychusaurView Answer on Stackoverflow
Solution 2 - PythontiagoView Answer on Stackoverflow
Solution 3 - PythonG MView Answer on Stackoverflow
Solution 4 - PythonBi RicoView Answer on Stackoverflow
Solution 5 - PythonAbhishekView Answer on Stackoverflow
Solution 6 - PythonNathanView Answer on Stackoverflow
Solution 7 - PythonJorge CrvzView Answer on Stackoverflow
Solution 8 - PythonAbhishekView Answer on Stackoverflow
Solution 9 - Pythonk1m190rView Answer on Stackoverflow
Solution 10 - PythoncvanelterenView Answer on Stackoverflow
Solution 11 - PythonVigneshwaran NarayananView Answer on Stackoverflow
Solution 12 - PythonDeba Pratim SahaView Answer on Stackoverflow