dropping rows from dataframe based on a "not in" condition

PythonPandas

Python Problem Overview


I want to drop rows from a pandas dataframe when the value of the date column is in a list of dates. The following code doesn't work:

a=['2015-01-01' , '2015-02-01']

df=df[df.datecolumn not in a]

I get the following error:

> ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Python Solutions


Solution 1 - Python

You can use pandas.Dataframe.isin.

pandas.Dateframe.isin will return boolean values depending on whether each element is inside the list a or not. You then invert this with the ~ to convert True to False and vice versa.

import pandas as pd

a = ['2015-01-01' , '2015-02-01']

df = pd.DataFrame(data={'date':['2015-01-01' , '2015-02-01', '2015-03-01' , '2015-04-01', '2015-05-01' , '2015-06-01']})

print(df)
#         date
#0  2015-01-01
#1  2015-02-01
#2  2015-03-01
#3  2015-04-01
#4  2015-05-01
#5  2015-06-01

df = df[~df['date'].isin(a)]

print(df)
#         date
#2  2015-03-01
#3  2015-04-01
#4  2015-05-01
#5  2015-06-01

Solution 2 - Python

You can use Series.isin:

df = df[~df.datecolumn.isin(a)]

While the error message suggests that all() or any() can be used, they are useful only when you want to reduce the result into a single Boolean value. That is however not what you are trying to do now, which is to test the membership of every values in the Series against the external list, and keep the results intact (i.e., a Boolean Series which will then be used to slice the original DataFrame).

You can read more about this in the Gotchas.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questiongaurav gurnaniView Question on Stackoverflow
Solution 1 - PythonFfisegyddView Answer on Stackoverflow
Solution 2 - PythonYS-LView Answer on Stackoverflow