How to sort and remove duplicates from Python list?

PythonListSortingUnique

Python Problem Overview


Given a list of strings, I want to sort it alphabetically and remove duplicates. I know I can do this:

from sets import Set
[...]
myHash = Set(myList)

but I don't know how to retrieve the list members from the hash in alphabetical order.

I'm not married to the hash, so any way to accomplish this will work. Also, performance is not an issue, so I'd prefer a solution that is expressed in code clearly to a fast but more opaque one.

Python Solutions


Solution 1 - Python

A list can be sorted and deduplicated using built-in functions:

myList = sorted(set(myList))
  • set is a built-in function for Python >= 2.3
  • sorted is a built-in function for Python >= 2.4

Solution 2 - Python

If your input is already sorted, then there may be a simpler way to do it:

from operator import itemgetter
from itertools import groupby
unique_list = list(map(itemgetter(0), groupby(yourList)))

Solution 3 - Python

If you want to keep order of the original list, just use OrderedDict with None as values.

In Python2:

    from collections import OrderedDict
    from itertools import izip, repeat

    unique_list = list(OrderedDict(izip(my_list, repeat(None))))

In Python3 it's even simpler:

    from collections import OrderedDict
    from itertools import repeat

    unique_list = list(OrderedDict(zip(my_list, repeat(None))))

If you don't like iterators (zip and repeat) you can use a generator (works both in 2 & 3):

    from collections import OrderedDict
    unique_list = list(OrderedDict((element, None) for element in my_list))

Solution 4 - Python

If it's clarity you're after, rather than speed, I think this is very clear:

def sortAndUniq(input):
  output = []
  for x in input:
    if x not in output:
      output.append(x)
  output.sort()
  return output

It's O(n^2) though, with the repeated use of not in for each element of the input list.

Solution 5 - Python

> but I don't know how to retrieve the list members from the hash in alphabetical order.

Not really your main question, but for future reference Rod's answer using sorted can be used for traversing a dict's keys in sorted order:

for key in sorted(my_dict.keys()):
   print key, my_dict[key]
   ...

and also because tuple's are ordered by the first member of the tuple, you can do the same with items:

for key, val in sorted(my_dict.items()):
    print key, val
    ...

Solution 6 - Python

For the string data

 output = []

     def uniq(input):
         if input not in output:
            output.append(input)
 print output     

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJosh GloverView Question on Stackoverflow
Solution 1 - PythonRod DaunoraviciusView Answer on Stackoverflow
Solution 2 - PythonsykoraView Answer on Stackoverflow
Solution 3 - PythonPaweł SobkowiakView Answer on Stackoverflow
Solution 4 - PythonunwindView Answer on Stackoverflow
Solution 5 - PythondavidavrView Answer on Stackoverflow
Solution 6 - Pythonuser2515605View Answer on Stackoverflow