Count number of non-NaN entries in every column of Dataframe
PythonPandasDataframeCountNanPython Problem Overview
I have a really big DataFrame and I was wondering if there was short (one or two liner) way to get the a count of non-NaN entries in a DataFrame. I don't want to do this one column at a time as I have close to 1000 columns.
df1 = pd.DataFrame([(1,2,None),(None,4,None),(5,None,7),(5,None,None)],
columns=['a','b','d'], index = ['A', 'B','C','D'])
a b d
A 1 2 NaN
B NaN 4 NaN
C 5 NaN 7
D 5 NaN NaN
Output:
a: 3
b: 2
d: 1
Python Solutions
Solution 1 - Python
The count()
method returns the number of non-NaN
values in each column:
>>> df1.count()
a 3
b 2
d 1
dtype: int64
Similarly, count(axis=1)
returns the number of non-NaN
values in each row.
Solution 2 - Python
If you want to sum the total count values which are not NAN, one can do;
np.sum(df.count())
Solution 3 - Python
In case you are dealing with empty strings you may want to count them as NA as well :
df.replace('', np.nan).count()
or if you also want to remove blank strings :
df.replace(r'^\s*$', np.nan, regex=True).count()