How can I check for NaN values?
Python Problem Overview
float('nan') represents NaN (not a number). But how do I check for it?
Python Solutions
Solution 1 - Python
Use math.isnan:
>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True
Solution 2 - Python
The usual way to test for a NaN is to see whether it is unequal to itself:
def isNaN(num):
    return num != num
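As a quick illustration of the self-comparison trick, the sketch below applies it to a few floats. The sample values, including inf - inf as a way to produce a NaN, are chosen for this example and are not part of the original answer.

```python
def isNaN(num):
    # Under IEEE 754, NaN is the only float that compares unequal to itself.
    return num != num

inf = float("inf")
samples = {
    "float('nan')": float("nan"),
    "inf - inf": inf - inf,  # an invalid operation that yields NaN
    "inf": inf,
    "0.0": 0.0,
}

results = {label: isNaN(value) for label, value in samples.items()}
for label, flag in results.items():
    print(f"{label:<14} {flag}")
```

Only the two NaN entries should report True; infinity and ordinary numbers compare equal to themselves.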
Solution 3 - Python
numpy.isnan(number) tells you whether it's NaN or not.
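numpy.isnan also works element-wise on arrays, which is probably the main reason to reach for it. A minimal sketch, assuming NumPy is installed; the array contents are made up for illustration:

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, np.nan])

# np.isnan returns a boolean mask with the same shape as its input
mask = np.isnan(values)
print(mask)

# The mask can be used to count the NaN entries or to filter them out
nan_count = int(mask.sum())
clean = values[~mask]
print(nan_count, clean)
```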
Solution 4 - Python
Here are three ways to test whether a variable is NaN.
import pandas as pd
import numpy as np
import math
# For a single variable, all three functions return a single boolean
x1 = float("nan")
print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")
Output:
It's pd.isna: True
It's np.isnan: True
It's math.isnan: True
Solution 5 - Python
Here is an answer that works with:
- NaN implementations that respect the IEEE 754 standard, e.g. Python's NaN: float('nan'), numpy.nan ...
- any other objects: strings or whatever (does not raise exceptions if encountered)
A NaN implemented following the standard is the only value for which the inequality comparison with itself should return True:
def is_nan(x):
    return (x != x)
And some examples:
import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")
Output:
nan : True
nan : True
55 : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False
Solution 6 - Python
It seems that checking whether the value is unequal to itself (x != x) is the fastest.
import pandas as pd
import numpy as np
import math
x = float('nan')
%timeit x!=x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
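The %timeit lines above only run inside IPython or Jupyter. For a plain Python script, a rough equivalent can be sketched with the standard-library timeit module. The loop counts here are arbitrary choices, only the two dependency-free checks are timed, and the numbers will differ from machine to machine.

```python
import math
import timeit

x = float("nan")
statements = ["x != x", "math.isnan(x)"]
loops = 100_000

timings = {}
for stmt in statements:
    # Repeat the measurement and keep the best run to reduce noise
    runs = timeit.repeat(stmt, globals={"x": x, "math": math}, number=loops, repeat=5)
    timings[stmt] = min(runs) / loops * 1e9  # nanoseconds per loop

for stmt, ns in timings.items():
    print(f"{stmt:<15} {ns:.1f} ns per loop")
```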
Solution 7 - Python
I actually just ran into this, but for me it was checking for nan, -inf, or inf. I just used
if float('-inf') < float(num) < float('inf'):
This is true for numbers, false for nan and both inf, and will raise an exception for things like strings or other types (which is probably a good thing). Also this does not require importing any libraries like math or numpy (numpy is so damn big it doubles the size of any compiled application).
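To make the behaviour of that chained comparison concrete, here is a small sketch that wraps it in a function and compares it against math.isfinite. The wrapper name is_finite_number is hypothetical, and math.isfinite is not mentioned in the answer; unlike the comparison, it needs the math import.

```python
import math

def is_finite_number(num):
    # The chained comparison from the answer: False for NaN and for both infinities
    return float("-inf") < float(num) < float("inf")

samples = [1.5, 0, -3, float("nan"), float("inf"), float("-inf")]
agreement = [is_finite_number(v) == math.isfinite(v) for v in samples]
print(all(agreement))

# float() parses numeric strings, so "2.5" is accepted; non-numeric strings raise ValueError
print(is_finite_number("2.5"))
```

Because float() parses text, a numeric string such as "2.5" passes the check, while a non-numeric string raises ValueError.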
Solution 8 - Python
Or compare the number to itself. NaN is always != NaN; for any other value (e.g. an ordinary number), the equality comparison with itself succeeds.
Solution 9 - Python
I came to this post because I had some issues with the function math.isnan(). There is a problem when you run this code:
import math
a = "hello"
math.isnan(a)
It raises a TypeError. My solution is to add another check:
def is_nan(x):
    return isinstance(x, float) and math.isnan(x)
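A short usage sketch of this guard, with inputs invented for illustration. Note that anything that is not a float instance, including ints and containers holding NaN, simply yields False.

```python
import math

def is_nan(x):
    # isinstance is evaluated first, so math.isnan only ever receives floats
    return isinstance(x, float) and math.isnan(x)

inputs = [float("nan"), 1.0, 1, "hello", None, [float("nan")]]
checks = [is_nan(value) for value in inputs]
print(checks)
```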
Solution 10 - Python
Another method if you're stuck on Python < 2.6, you don't have numpy, and you don't have IEEE 754 support:
def isNaN(x):
    return str(x) == str(1e400*0)
Solution 11 - Python
With Python < 2.6 I ended up with
def isNaN(x):
    return str(float(x)).lower() == 'nan'
This works for me with Python 2.5.1 on a Solaris 5.9 box and with Python 2.6.5 on Ubuntu 10.
Solution 12 - Python
I am receiving data from a web service that sends NaN as the string 'Nan'. But there could be other sorts of strings in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer:
import math
def isnan(value):
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError, OverflowError):
        # value cannot be converted to a float, so it is not NaN
        return False
Requirement:
isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) == True
Solution 13 - Python
All the methods to tell if the variable is NaN or None:
None type
In [1]: import math
In [2]: a = None
In [3]: not a
Out[3]: True
In [4]: len(a or ()) == 0
Out[4]: True
In [5]: a == None
Out[5]: True
In [6]: a is None
Out[6]: True
In [7]: a != a
Out[7]: False
In [9]: math.isnan(a)
Traceback (most recent call last):
File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
math.isnan(a)
TypeError: a float is required
In [10]: len(a) == 0
Traceback (most recent call last):
File "<ipython-input-10-65b72372873e>", line 1, in <module>
len(a) == 0
TypeError: object of type 'NoneType' has no len()
NaN type
In [11]: b = float('nan')
In [12]: b
Out[12]: nan
In [13]: not b
Out[13]: False
In [14]: b != b
Out[14]: True
In [15]: math.isnan(b)
Out[15]: True
Solution 14 - Python
How to remove NaN (float) item(s) from a list of mixed data types
If you have mixed types in an iterable, here is a solution that does not use numpy:
from math import isnan
Z = ['a','b', float('NaN'), 'd', float('1.1024')]
[x for x in Z if not (
    type(x) == float  # let's drop all float values…
    and isnan(x)      # … but only if they are nan
)]
['a', 'b', 'd', 1.1024]
Short-circuit evaluation means that isnan will not be called on values that are not of type float, as False and (…) quickly evaluates to False without having to evaluate the right-hand side.
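The same filter can be packaged as a reusable helper. This is a minimal sketch: the name drop_nan is hypothetical, and it swaps type(x) == float for isinstance(x, float), which also accepts float subclasses.

```python
from math import isnan

def drop_nan(items):
    """Return a new list without float NaN entries; all other values pass through."""
    # isinstance short-circuits, so isnan is never called on non-floats
    return [x for x in items if not (isinstance(x, float) and isnan(x))]

Z = ['a', 'b', float('NaN'), 'd', float('1.1024')]
cleaned = drop_nan(Z)
print(cleaned)
```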
Solution 15 - Python
In Python 3.6, calling math.isnan(x) or np.isnan(x) on a string value x raises an error, so I can't check whether a given value is NaN unless I know beforehand that it's a number. The following seems to solve this issue:
if str(x) == 'nan' and not isinstance(x, str):
    print('NaN')
else:
    print('non NaN')
Solution 16 - Python
Comparison of pd.isna, math.isnan and np.isnan and their flexibility in dealing with different types of objects.
The table below shows whether each type of object can be checked with the given method:
+------------+-----+---------+------+--------+------+
| Method | NaN | numeric | None | string | list |
+------------+-----+---------+------+--------+------+
| pd.isna | yes | yes | yes | yes | yes |
| math.isnan | yes | yes | no | no | no |
| np.isnan | yes | yes | no | no | yes | <-- # will error on mixed type list
+------------+-----+---------+------+--------+------+
pd.isna
The most flexible method to check for different types of missing values.
None of the answers cover the flexibility of pd.isna. While math.isnan and np.isnan will return True for NaN values, they cannot check other types of objects such as None or strings. Both raise an error on those, so checking a list with mixed types is cumbersome. pd.isna, by contrast, is flexible and returns the correct boolean for different kinds of types:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: missing_values = [3, None, np.nan, pd.NA, pd.NaT, '10']
In [4]: pd.isna(missing_values)
Out[4]: array([False, True, True, True, True, False])
Solution 17 - Python
For a NaN of type float:
>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False
Solution 18 - Python
For strings in pandas, use pd.isnull:
if not pd.isnull(atext):
    for word in nltk.word_tokenize(atext):
The full function, used for feature extraction with NLTK:
import nltk
import pandas as pd
# default_stopwords is assumed to be a collection of stopwords defined elsewhere
def act_features(atext):
    features = {}
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features