Pandas - Filtering None Values

PythonPandasDataframe

Python Problem Overview


I'm using Pandas to explore some datasets. I have this dataframe:

enter image description here

I want to exclude any row that has a city value. So I've tried:

new_df = all_df[(all_df["City"] == "None") ]
new_df

But then I got an empty dataframe:

enter image description here

It works whenever I use any value other than None. Any idea how to filter this dataframe?

Python Solutions


Solution 1 - Python

Consider using isnull() to locate missing values

all_df[all_df['City'].isnull()]

Solution 2 - Python

Try this to select only the None values for city column:

new_df = all_df['City'][all_df['City'] == "None"]

Try this to see all other columns which has the same rows of 'City'==None

new_df = all_df[all_df['City'] == "None"]
print(new_df.head()) # with function head() you can see the first 5 rows

Solution 3 - Python

I hope "where" can do what you expect

new_df = new_df.where(new_df["city"], None) 

And it is better use np.nan rather than None.

For more details pandas.DataFrame.where

Solution 4 - Python

If it's desired to filter multiple rows with None values, we could use any, all or sum. For example, for df given below:

   FACTS_Value    Region      City  Village
0        16482  Al Bahah      None     None
1        22522  Al Bahah   Al Aqiq     None
2        12444  Al Bahah   Al Aqiq  Al Aqiq
3        12823  Al Bahah  Al Bahah  Al Aqiq
4        11874      None      None     None

If we want to select all rows with None value in at least one column, we could use isna + any on axis to build a boolean mask:

msk = df.isna().any(axis=1)
out = df[msk]

Output:

   FACTS_Value    Region     City Village
0        16482  Al Bahah     None    None
1        22522  Al Bahah  Al Aqiq    None
4        11874      None     None    None

If we want the rows where all non-numeric column values are None, then we could use isna + all on axis:

msk = df.select_dtypes(exclude='number').isna().all(axis=1)

or

msk = df[['Region', 'City', 'Village']].isna().all(axis=1)
out = df[msk]

Output:

   FACTS_Value Region  City Village
4        11874   None  None    None

If we want to filter rows where there are exactly n None values, then we could use sum on axis + eq:

msk = df.isna().sum(axis=1) == 2
out = df[msk]

Output:

   FACTS_Value    Region  City Village
0        16482  Al Bahah  None    None

Solution 5 - Python

Another alternative is to use the query method:

In [3]: all_df.query('City != City')
Out[3]: 
   FACTS_Value    Region  City Village
0      34135.0  Al Bahah  None    None

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionShadinView Question on Stackoverflow
Solution 1 - PythontarashypkaView Answer on Stackoverflow
Solution 2 - PythonimanzabetView Answer on Stackoverflow
Solution 3 - PythonVibhutha KumarageView Answer on Stackoverflow
Solution 4 - Pythonuser7864386View Answer on Stackoverflow
Solution 5 - PythonrachwaView Answer on Stackoverflow