Find the unique values in a column and then sort them

Tags: Python, Pandas, Sorting, Dataframe, Unique

Python Problem Overview


I have a pandas dataframe. I want to print the unique values of one of its columns in ascending order. This is how I am doing it:

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print a.sort()

The problem is that I am getting a None for the output.

Python Solutions


Solution 1 - Python

sorted(iterable): Return a new sorted list from the items in iterable. Unlike a.sort(), it returns its result, so the call can be passed straight to print().

CODE

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print(sorted(a))

OUTPUT

[1, 2, 3, 6, 8]
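One detail worth knowing: sorted() applied to a NumPy array returns a list whose items are still NumPy scalars, so how the list prints can vary with the NumPy version. A minimal sketch, assuming plain Python integers are wanted, converts the array with tolist() first (the name values is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})

# unique() returns a NumPy array; tolist() turns its items into plain Python ints
values = sorted(df['A'].unique().tolist())
print(values)  # [1, 2, 3, 6, 8]
```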

Solution 2 - Python

sort() sorts the array in place and returns None, so the call itself gives you nothing to print:

In [54]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
a

Out[54]:
array([1, 2, 3, 6, 8], dtype=int64)

So you have to print a in a separate statement, after the call to sort().

E.g.:

In [55]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
print(a)

[1 2 3 6 8]
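The whole sequence can be traced in one short sketch (the names before and result are illustrative, not from the original answer): unique() returns the values in order of first appearance, the sort() call itself evaluates to None, and only afterwards does the array hold the sorted values.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})
a = df['A'].unique()

before = a.tolist()   # order of first appearance: [1, 3, 2, 6, 8]
result = a.sort()     # sorts a in place; the call itself returns None

print(before)   # [1, 3, 2, 6, 8]
print(result)   # None
print(a)        # [1 2 3 6 8]
```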

Solution 3 - Python

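You can also use drop_duplicates() instead of unique(). It returns a Series, and since Series.sort() was removed in later pandas versions, sort it with sort_values(), which returns a new sorted Series:

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
# drop_duplicates() keeps the first occurrence of each value
a = df['A'].drop_duplicates().sort_values()
print(a)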
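Unlike unique(), drop_duplicates() keeps the original row labels, so the printed Series shows a shuffled-looking index next to the sorted values. A small sketch, assuming a clean 0..n-1 index is preferred, resets it with reset_index(drop=True); the names s and clean are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})

s = df['A'].drop_duplicates().sort_values()
print(s.index.tolist())   # original row labels: [0, 3, 2, 4, 6]

# drop=True discards the old labels instead of adding them as a column
clean = s.reset_index(drop=True)
print(clean.tolist())     # [1, 2, 3, 6, 8]
```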

Solution 4 - Python

I prefer the one-liner:

print(sorted(df['Column Name'].unique()))

Solution 5 - Python

Came across the question myself today. I think the reason that your code returns 'None' (exactly what I got by using the same method) is that

a.sort()

sorts a in place (a is a NumPy array here, not a list) and returns None. In my understanding, it is a command that modifies its object rather than an expression that produces a value. To see the result, you have to call print(a) afterwards.

My solution, as I tried to keep everything in pandas:

pd.Series(df['A'].unique()).sort_values()
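In contrast to the in-place a.sort(), sort_values() returns a new Series and leaves the one it was called on untouched, which is why it can be chained or printed directly. A brief sketch (the names u and s are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})

u = pd.Series(df['A'].unique())
s = u.sort_values()   # returns a new, sorted Series

print(u.tolist())     # unchanged: [1, 3, 2, 6, 8]
print(s.tolist())     # [1, 2, 3, 6, 8]
```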

Solution 6 - Python

The fastest code for large data frames:

df['A'].drop_duplicates().sort_values()
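Relative speed depends on data size, dtype, and library versions, so it is worth measuring on your own data. The following is only a rough sketch using timeit on randomly generated integers; the frame big, the row count, and the helper names via_sorted and via_pandas are assumptions made for illustration.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: 200,000 integers drawn from 0..999 (an arbitrary choice)
rng = np.random.default_rng(0)
big = pd.DataFrame({'A': rng.integers(0, 1000, size=200_000)})

def via_sorted():
    return sorted(big['A'].unique())

def via_pandas():
    return big['A'].drop_duplicates().sort_values()

for fn in (via_sorted, via_pandas):
    seconds = timeit.timeit(fn, number=5)   # total time for 5 runs
    print(fn.__name__, round(seconds, 4))
```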

Solution 7 - Python

I would suggest using NumPy's sort, since that is what pandas uses in the background anyway:

import numpy as np
np.sort(df.A.unique())

But doing it all in pandas is valid as well.
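A related NumPy option, not part of the original answer, is np.unique(), which returns the unique values already sorted, so no separate sort step is needed. A minimal sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})

# np.unique() deduplicates and sorts in one call
values = np.unique(df['A'].to_numpy())
print(values)  # [1 2 3 6 8]
```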

Solution 8 - Python

Another way is to use the set data type.

Some characteristics of sets: they are unordered, can hold mixed data types, cannot contain repeated elements, and are mutable.

Solving your question:

df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
sorted(set(df.A))

The answer, as a list:

[1, 2, 3, 6, 8]
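One caveat with this approach: if the column contains missing values, sorted() cannot order NaN reliably, because every comparison with NaN is false. A hedged sketch, assuming missing values should simply be left out, drops them first; the frame df_nan is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column with missing values
df_nan = pd.DataFrame({'A': [1, np.nan, 3, 2, np.nan, 2]})

# dropna() removes NaN before the set is built and sorted
values = sorted(set(df_nan['A'].dropna()))
print(values)  # [1.0, 2.0, 3.0]
```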

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | MAS | View Question on Stackoverflow
Solution 1 - Python | Vineet Kumar Doshi | View Answer on Stackoverflow
Solution 2 - Python | EdChum | View Answer on Stackoverflow
Solution 3 - Python | Meloun | View Answer on Stackoverflow
Solution 4 - Python | MDMoore313 | View Answer on Stackoverflow
Solution 5 - Python | Bowen Liu | View Answer on Stackoverflow
Solution 6 - Python | Serge Stroobandt | View Answer on Stackoverflow
Solution 7 - Python | Challensois | View Answer on Stackoverflow
Solution 8 - Python | Ivan Carrasco Quiroz | View Answer on Stackoverflow