Use a list of values to select rows from a Pandas dataframe
Python Problem Overview
Let’s say I have the following Pandas dataframe:
import pandas as pd
df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
df
A B
0 5 1
1 6 2
2 3 3
3 4 5
I can subset based on a specific value:
x = df[df['A'] == 3]
x
A B
2 3 3
But how can I subset based on a list of values? Something like this:
list_of_values = [3,6]
y = df[df['A'] in list_of_values]
To get:
A B
1 6 2
2 3 3
Python Solutions
Solution 1 - Python
You can use the isin method:
In [1]: df = pd.DataFrame({'A': [5,6,3,4], 'B': [1,2,3,5]})
In [2]: df
Out[2]:
A B
0 5 1
1 6 2
2 3 3
3 4 5
In [3]: df[df['A'].isin([3, 6])]
Out[3]:
A B
1 6 2
2 3 3
To get the opposite, negate the mask with ~:
In [4]: df[~df['A'].isin([3, 6])]
Out[4]:
A B
0 5 1
3 4 5
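As a minimal sketch of why the attempt in the question fails while isin works: Python's `in` operator needs a single True/False answer, but comparing a value with a Series gives one boolean per row, so pandas raises a ValueError. The names `in_operator_failed`, `mask`, `selected`, and `excluded` are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
list_of_values = [3, 6]

# `in` compares each list element with the whole Series and then asks for
# the truth value of the resulting boolean Series, which pandas refuses.
try:
    df[df['A'] in list_of_values]
    in_operator_failed = False
except ValueError:
    in_operator_failed = True

# isin returns a boolean Series aligned with df's index,
# so it can be used directly as a row mask.
mask = df['A'].isin(list_of_values)
selected = df[mask]    # rows where A is 3 or 6
excluded = df[~mask]   # all remaining rows
```

Keeping the mask in a variable makes it easy to reuse it for both the selection and its complement.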
Solution 2 - Python
You can use the query method:
df.query('A in [6, 3]')
# df.query('A == [6, 3]')
or
lst = [6, 3]
df.query('A in @lst')
# df.query('A == @lst')
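A self-contained sketch of the query approach, for anyone who wants to run it as a script. It also shows `not in` for the complement, which is not part of the original answer; the variable names `inside` and `outside` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
lst = [6, 3]

# The @ prefix tells query to look up `lst` in the surrounding Python scope.
inside = df.query('A in @lst')

# `not in` selects the rows whose A value is absent from the list.
outside = df.query('A not in @lst')
```

The rows come back in the dataframe's original order, not in the order of the values in the list.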
Solution 3 - Python
Another method is to build the mask row by row with apply:
df.loc[df.apply(lambda x: x.A in [3,6], axis=1)]
Unlike the isin method, this approach is useful when the test applies to a function of column A rather than to A itself. For example, with f(A) = 2*A - 5 as the function:
df.loc[df.apply(lambda x: 2*x.A-5 in [3,6], axis=1)]
Note that this approach is slower than the isin method, because apply with axis=1 calls the lambda once per row in Python.
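When the function can be written with ordinary column arithmetic, as f(A) = 2*A - 5 can, one possible alternative is to compute it on the whole column and then call isin on the result. This is a sketch, not part of the original answer, and the names `row_wise` and `vectorized` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})

# Row-wise version from the answer: the lambda runs once per row.
row_wise = df.loc[df.apply(lambda x: 2 * x.A - 5 in [3, 6], axis=1)]

# Vectorized version: compute f(A) for the whole column, then test membership.
vectorized = df.loc[(2 * df['A'] - 5).isin([3, 6])]
```

Both versions select the same rows here; apply remains the fallback for functions that cannot be expressed as column operations.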