How to filter in NaN (pandas)?

Tags: Python, Pandas, NaN

Python Problem Overview


I have a pandas dataframe (df), and I want to do something like:

newdf = df[(df.var1 == 'a') & (df.var2 == NaN)]

I've tried replacing NaN with np.NaN, 'NaN', 'nan', and so on, but none of these comparisons evaluates to True. There's no pd.NaN.

I can use df.fillna(np.nan) before evaluating the above expression, but that feels hackish, and I wonder whether it will interfere with other pandas operations that rely on being able to identify pandas-format NaNs later.

I get the feeling there should be an easy answer to this question, but somehow it has eluded me. Any advice is appreciated. Thank you.

Python Solutions


Solution 1 - Python

Simplest of all solutions:

filtered_df = df[df['var2'].isnull()]

This keeps only the rows whose 'var2' column is NaN.
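To get the two-condition filter from the question, the null check can be combined with the equality test using &. Here is a minimal sketch; the sample data is made up for illustration, and only the column names var1 and var2 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b", "a"],
    "var2": [1.0, np.nan, np.nan, 3.0],
})

# Keep rows where var1 equals 'a' AND var2 is missing.
# Each condition needs its own parentheses because & binds tighter than ==.
newdf = df[(df["var1"] == "a") & (df["var2"].isnull())]
print(newdf)
```

With this sample data, only the row at index 1 satisfies both conditions.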

Solution 2 - Python

This doesn't work because NaN isn't equal to anything, including NaN. Use pd.isnull(df.var2) instead.
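A quick way to see the difference is to build both masks side by side. This is only a sketch with a made-up Series; the names eq_mask and null_mask are illustrative.

```python
import numpy as np
import pandas as pd

# NaN compares unequal to everything, itself included.
print(np.nan == np.nan)  # False

s = pd.Series([1.0, np.nan, 3.0])  # hypothetical data

eq_mask = s == np.nan      # every element is False, so it selects nothing
null_mask = pd.isnull(s)   # True only where the value is missing

print(s[eq_mask])    # empty Series
print(s[null_mask])  # the single NaN entry
```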

Solution 3 - Python

df[df['var'].isna()]

where

df  : The DataFrame
var : The Column Name
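In recent pandas versions, isna performs the same check as isnull under a different name, and notna gives the complement. The sketch below uses a made-up object column to show that None is treated as missing too; the variable names missing and present are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an object column mixing NaN, a string, and None.
df = pd.DataFrame({"var": [np.nan, "x", None]})

missing = df[df["var"].isna()]    # rows where var is NaN or None
present = df[df["var"].notna()]   # the complement: rows with a real value

print(missing.index.tolist())
print(present.index.tolist())
```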

Solution 4 - Python

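Pandas uses numpy's NaN value to mark missing entries in float columns. Use numpy.isnan to obtain a Boolean vector from a numeric pandas Series; it is not meant for object columns such as strings, where isnull/isna is the safer choice.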
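For a float column, that could look like the sketch below. The data is made up, and the approach assumes a numeric dtype; as far as I know, numpy.isnan raises a TypeError on object-dtype columns.

```python
import numpy as np
import pandas as pd

# Hypothetical float column; np.isnan is only defined for numeric dtypes.
df = pd.DataFrame({"var2": [1.0, np.nan, 3.0]})

mask = np.isnan(df["var2"])  # Boolean mask aligned with df's index
filtered = df[mask]          # rows where var2 is NaN

print(filtered)
```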

Solution 5 - Python

You can also use query here:

df.query('var2 != var2')

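This works because NaN is the only value that is not equal to itself (np.nan != np.nan is True), so the comparison var2 != var2 is True only on the NaN rows.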
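The same trick can be combined with the var1 condition from the question inside a single query string. A minimal sketch with made-up data follows; the names only_nan and both are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b"],
    "var2": [np.nan, 2.0, np.nan],
})

# Rows where var2 is NaN (NaN is never equal to itself).
only_nan = df.query("var2 != var2")

# Rows where var1 is 'a' and var2 is NaN.
both = df.query("var1 == 'a' and var2 != var2")

print(only_nan)
print(both)
```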

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Gerhard | View Question on Stackoverflow
Solution 1 - Python | Gil Baggio | View Answer on Stackoverflow
Solution 2 - Python | Mark Whitfield | View Answer on Stackoverflow
Solution 3 - Python | Mohammad Shalaby | View Answer on Stackoverflow
Solution 4 - Python | NicholasM | View Answer on Stackoverflow
Solution 5 - Python | rachwa | View Answer on Stackoverflow