Better way to drop NaN rows in pandas
Python Problem Overview
On my own I found a way to drop NaN rows from a pandas DataFrame. Given a DataFrame dat with a column x that contains NaN values, is there a more elegant way to drop each row of dat that has a NaN value in the x column?
dat = dat[np.logical_not(np.isnan(dat.x))]
dat = dat.reset_index(drop=True)
Python Solutions
Solution 1 - Python
Use dropna:
dat.dropna()
You can pass the how parameter to choose whether a row is dropped when any of its values are NaN or only when all of them are:
dat.dropna(how='any')  # drop a row if any value in it is NaN (the default)
dat.dropna(how='all')  # drop a row only if all values in it are NaN
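A small sketch of the difference between the two settings. The sample frame and its column names are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 is partly NaN, row 2 is entirely NaN.
dat = pd.DataFrame({
    "x": [1.0, np.nan, np.nan],
    "y": [10.0, 20.0, np.nan],
})

any_dropped = dat.dropna(how="any")  # keeps only rows with no NaN at all
all_dropped = dat.dropna(how="all")  # drops only rows that are entirely NaN

print(any_dropped)
print(all_dropped)
```

Note that both calls return a new DataFrame and leave dat itself unchanged.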
Hope that answers your question!
Edit 1:
In case you want to drop rows containing NaN values only from particular column(s), as suggested by J. Doe in his answer below, you can use the following:
dat.dropna(subset=col_list)  # col_list is a list of column names to check for NaN values
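As a rough illustration with several columns, assuming a made-up frame in which only x and y matter and z is allowed to be NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: z is NaN everywhere but is not listed in col_list.
dat = pd.DataFrame({
    "x": [1.0, np.nan, 3.0],
    "y": [4.0, 5.0, np.nan],
    "z": [np.nan, np.nan, np.nan],
})

col_list = ["x", "y"]  # only these columns are checked for NaN

# A row is dropped if any of the listed columns is NaN in that row.
kept = dat.dropna(subset=col_list)
print(kept)
```

Only the first row survives here, because it is the only one where both x and y are present. The NaN in z is ignored.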
Solution 2 - Python
To expand on Hitesh's answer: if you want to drop rows where 'x' specifically is NaN, you can use the subset parameter. His answer will also drop rows where other columns contain NaNs.
dat.dropna(subset=['x'])
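To check that this does the same job as the two lines in the question, here is a minimal comparison on a made-up frame (the data is an assumption, the column name x is from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: x has one NaN, y has a NaN in a different row.
dat = pd.DataFrame({
    "x": [1.0, np.nan, 3.0],
    "y": [np.nan, 5.0, 6.0],
})

# Approach from the question.
original = dat[np.logical_not(np.isnan(dat.x))].reset_index(drop=True)

# dropna with subset, followed by the same index reset.
shorter = dat.dropna(subset=["x"]).reset_index(drop=True)

print(original.equals(shorter))
```

The row with a NaN only in y is kept by both versions, since only x is checked.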
Solution 3 - Python
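Just in case the commands in the previous answers don't seem to work (they return a new DataFrame and leave dat unchanged unless you assign the result back), try this, which modifies dat in place:
dat.dropna(subset=['x'], inplace=True)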
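A short sketch of why the earlier commands can look like they do nothing, using a made-up two-row frame:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one NaN in x.
dat = pd.DataFrame({"x": [1.0, np.nan]})

# Without inplace, the result is a new DataFrame and dat is untouched.
result = dat.dropna(subset=["x"])
rows_before = len(dat)

# With inplace=True, dat itself is modified.
dat.dropna(subset=["x"], inplace=True)
rows_after = len(dat)

print(rows_before, rows_after)
```

Assigning the result back, as in dat = dat.dropna(subset=['x']), has the same effect on dat as the inplace call.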
Solution 4 - Python
bool_series = pd.notnull(dat["x"])
dat = dat[bool_series]
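For what it's worth, the boolean mask gives the same rows as dropna with subset. A minimal sketch on made-up data, using the notna method, which is the Series equivalent of pd.notnull:

```python
import pandas as pd

# Hypothetical data; None is stored as NaN in a float column.
dat = pd.DataFrame({"x": [1.0, None, 3.0]})

mask = dat["x"].notna()  # True where x has a value
filtered = dat[mask]

# Same rows and same index as the dropna version.
same = filtered.equals(dat.dropna(subset=["x"]))
print(mask.tolist(), same)
```

The mask form is handy when you want to combine the NaN check with other conditions using & or |.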
Solution 5 - Python
To remove rows based on a NaN value in a particular column:
d = pd.DataFrame([[2, 3], [4, None]])  # create the DataFrame
d
Output:
   0    1
0  2  3.0
1  4  NaN
d = d[np.isfinite(d[1])]  # keep rows where the column labeled 1 is finite (drops NaN, and also inf)
d
Output:
   0    1
0  2  3.0
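One thing to be aware of: np.isfinite is stricter than a NaN check, because it also rejects infinite values. A small sketch on made-up data (the column name v is an assumption):

```python
import numpy as np
import pandas as pd

# Hypothetical column with one NaN and one infinite value.
d = pd.DataFrame({"v": [1.0, np.nan, np.inf]})

finite_only = d[np.isfinite(d["v"])]  # drops both NaN and inf
not_nan = d[d["v"].notna()]           # drops only NaN, keeps inf

print(len(finite_only), len(not_nan))
```

If the column can contain inf and you want to keep it, the notna mask or dropna is the safer choice.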