Checking if all elements in a list are unique

Tags: Python, Algorithm, List, Unique

Python Problem Overview


What is the best way (best as in the conventional way) of checking whether all elements in a list are unique?

My current approach using a Counter is:

>>> from collections import Counter
>>> x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
>>> counter = Counter(x)
>>> for count in counter.values():
...     if count > 1:
...         pass  # do something

Can I do better?

Python Solutions


Solution 1 - Python

Not the most efficient, but straightforward and concise:

if len(x) > len(set(x)):
   pass # do something

Probably won't make much of a difference for short lists.

Solution 2 - Python

Here is a two-liner that will also do early exit:

>>> def allUnique(x):
...     seen = set()
...     return not any(i in seen or seen.add(i) for i in x)
...
>>> allUnique("ABCDEF")
True
>>> allUnique("ABACDEF")
False

If the elements of x aren't hashable, then you'll have to resort to using a list for seen:

>>> def allUnique(x):
...     seen = list()
...     return not any(i in seen or seen.append(i) for i in x)
...
>>> allUnique([list("ABC"), list("DEF")])
True
>>> allUnique([list("ABC"), list("DEF"), list("ABC")])
False
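
The two variants above can be folded into one function that uses the set where it can and falls back to a list only for unhashable items. This is a sketch rather than part of the original answer: the name all_unique_mixed and the two-container layout are illustrative, and it assumes an unhashable item never compares equal to a hashable one.

```python
def all_unique_mixed(items):
    """Return True if no two items compare equal (hypothetical helper)."""
    seen_hashable = set()    # O(1) average membership test
    seen_unhashable = []     # O(n) membership test, used only when hashing fails
    for item in items:
        try:
            if item in seen_hashable:
                return False
            seen_hashable.add(item)
        except TypeError:
            # Unhashable item (e.g. a list): fall back to a linear search.
            if item in seen_unhashable:
                return False
            seen_unhashable.append(item)
    return True
```

Hashable inputs keep the speed of the set version, and the quadratic list search is paid only for the unhashable items.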

Solution 3 - Python

An early-exit solution could be:

def unique_values(g):
    s = set()
    for x in g:
        if x in s: return False
        s.add(x)
    return True

However, for small inputs, or when an early exit is not the common case, I would expect len(x) != len(set(x)) to be the fastest method.
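
One way to check that expectation on your own machine is a small timeit comparison. The sketch below is only illustrative: unique_by_len, the input sizes, and the repeat count are assumptions chosen for the demonstration, and the printed timings will vary between machines and Python versions.

```python
import timeit

def unique_values(g):
    s = set()
    for x in g:
        if x in s:
            return False
        s.add(x)
    return True

def unique_by_len(x):
    # Hypothetical name for the len/set comparison.
    return len(x) == len(set(x))

no_dupes = list(range(10_000))               # early exit never triggers
early_dupe = [0, 0] + list(range(10_000))    # duplicate found at the second item

for label, data in [("no duplicates", no_dupes), ("early duplicate", early_dupe)]:
    t_loop = timeit.timeit(lambda: unique_values(data), number=100)
    t_len = timeit.timeit(lambda: unique_by_len(data), number=100)
    print(f"{label}: loop {t_loop:.4f}s, len/set {t_len:.4f}s")
```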

Solution 4 - Python

For speed, using NumPy:

import numpy as np
x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
np.unique(x).size == len(x)

Solution 5 - Python

How about adding all the entries to a set and checking its length?

len(set(x)) == len(x)

Solution 6 - Python

As an alternative to a set, you can use a dict:

len(dict.fromkeys(x)) == len(x)

Solution 7 - Python

Another approach entirely, using sorted and groupby:

from itertools import groupby
is_unique = lambda seq: all(sum(1 for _ in x[1])==1 for x in groupby(sorted(seq)))

It requires a sort, but exits on the first repeated value.

Solution 8 - Python

Here is a recursive O(N^2) version for fun:

def is_unique(lst):
    if len(lst) > 1:
        return is_unique(lst[1:]) and (lst[0] not in lst[1:])
    return True

Solution 9 - Python

Here is a recursive early-exit function:

def distinct(L):
    # Lists with fewer than two elements cannot contain duplicates.
    if len(L) < 2:
        return True
    H = L[0]
    T = L[1:]
    if H in T:
        return False
    return distinct(T)

It's fast enough for me, avoids converting the list to another type, and keeps a functional style.

Solution 10 - Python

All the answers above are good, but I prefer the all_unique example from 30 seconds of python.

Use set() on the given list to remove duplicates, then compare its length with the length of the list.

def all_unique(lst):
    return len(lst) == len(set(lst))

It returns True if all the values in a flat list are unique, False otherwise:

x = [1,2,3,4,5,6]
y = [1,2,2,3,4,5]
all_unique(x) # True
all_unique(y) # False

Solution 11 - Python

How about this:

from collections import Counter

def is_unique(lst):
    # An empty list has no duplicates, and most_common(1) would be empty.
    if not lst:
        return True
    return Counter(lst).most_common(1)[0][1] == 1
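
If the goal is to "do something" with the repeated values, as in the question, the same Counter can also report which items are duplicated. This is a small sketch; duplicated_items is a hypothetical name, and the elements are assumed to be hashable.

```python
from collections import Counter

def duplicated_items(lst):
    """Return the items that occur more than once, in first-seen order."""
    return [item for item, count in Counter(lst).items() if count > 1]

x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
print(duplicated_items(x))        # [1, 2]
print(not duplicated_items(x))    # False, because the list has duplicates
```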

Solution 12 - Python

If you already have the data-processing library pandas among your dependencies, there is a built-in solution that returns the boolean you want:

import pandas as pd
pd.Series(lst).is_unique

Solution 13 - Python

You can use Yan's syntax (len(x) > len(set(x))), but instead of set(x), define a function:

def f5(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result

Then check len(x) > len(f5(x)). This is fast and also preserves order.

The code is taken from: http://www.peterbe.com/plog/uniqifiers-benchmark
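
On Python 3.7 and later, where dicts keep insertion order, dict.fromkeys gives the same kind of order-preserving de-duplication in much less code. The sketch below is an illustration, not the benchmarked code from the linked article; unique_in_order and its key parameter are hypothetical names standing in for f5 and idfun.

```python
def unique_in_order(seq, key=None):
    """Order-preserving de-duplication (illustrative counterpart to f5)."""
    if key is None:
        return list(dict.fromkeys(seq))
    seen = {}
    for item in seq:
        # Keep the first item seen for each marker value.
        seen.setdefault(key(item), item)
    return list(seen.values())

x = [3, 1, 3, 2, 1]
print(unique_in_order(x))                 # [3, 1, 2]
print(len(x) > len(unique_in_order(x)))   # True: x contains duplicates
```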

Solution 14 - Python

A similar approach works on a pandas DataFrame to test whether a column contains only unique values:

if tempDF['var1'].size == tempDF['var1'].unique().size:
    print("Unique")
else:
    print("Not unique")

For me, this is instantaneous on an int column in a DataFrame containing over a million rows.

Solution 15 - Python

This does not fully fit the question, but searching for the task I had ranks this question first, and it may interest readers as an extension of the question. If you want to determine for each list element whether it is unique, you can do the following:

import timeit
import numpy as np

def get_unique(mylist):
    # needs at least two elements: the first and last entries are handled separately
    # sort the values and keep each original index
    sort = sorted((e, i) for i, e in enumerate(mylist))
    # an element is unique if it differs from both neighbours in sorted order
    isunique = [[sort[0][1], sort[0][0] != sort[1][0]]] + \
               [[s[1], (s[0] != sort[i-1][0]) and (s[0] != sort[i+1][0])]
                for i, s in enumerate(sort) if 0 < i < len(sort) - 1] + \
               [[sort[-1][1], sort[-1][0] != sort[-2][0]]]
    # sort by original index and return only the booleans
    return [a[1] for a in sorted(isunique)]


def get_unique_using_count(mylist):
     return [mylist.count(item)==1 for item in mylist]

# %timeit is an IPython/Jupyter magic, so run the lines below in such a session
mylist = list(np.random.randint(0,10,10))
%timeit for x in range(10): get_unique(mylist)
%timeit for x in range(10): get_unique_using_count(mylist)

mylist = list(np.random.randint(0,1000,1000))
%timeit for x in range(10): get_unique(mylist)
%timeit for x in range(10): get_unique_using_count(mylist)

For short lists, get_unique_using_count, as suggested in some answers, is fast. But once the list is longer than about 100 elements, the repeated count calls take quite long, because each call scans the whole list. The approach in get_unique is then much faster, although it looks more complicated.
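
When the elements are hashable, a collections.Counter gives the same per-element answer in two linear passes. This swaps in a different technique from the sort-based get_unique, and get_unique_using_counter is a hypothetical name used only for this sketch.

```python
from collections import Counter

def get_unique_using_counter(mylist):
    counts = Counter(mylist)                        # one pass to count occurrences
    return [counts[item] == 1 for item in mylist]   # one pass to look each item up

print(get_unique_using_counter([4, 7, 4, 1]))  # [False, True, False, True]
```

Unlike get_unique, this also handles empty and one-element lists, but it cannot be used with unhashable elements.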

Solution 16 - Python

If the list is sorted anyway, you can use:

not any(sorted_list[i] == sorted_list[i + 1] for i in range(len(sorted_list) - 1))

Pretty efficient, but not worth sorting just for this purpose.
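
The same adjacent-pair check can be written without index arithmetic by zipping the list with itself shifted by one. This is a minimal sketch; sorted_all_unique is a hypothetical name, the input is assumed to be sorted already, and the slice makes a copy of the list.

```python
def sorted_all_unique(sorted_list):
    # Compare each element with its successor; all() stops at the first equal pair.
    return all(a != b for a, b in zip(sorted_list, sorted_list[1:]))

print(sorted_all_unique([1, 2, 3, 5]))  # True
print(sorted_all_unique([1, 2, 2, 5]))  # False
print(sorted_all_unique([]))            # True
```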

Solution 17 - Python

For beginners:

def AllDifferent(s):
    for i in range(len(s)):
        for i2 in range(len(s)):
            if i != i2:
                if s[i] == s[i2]:
                    return False
    return True
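
The nested loop above compares every pair twice (once as i, i2 and once as i2, i). A variant that starts the inner loop at i + 1 visits each pair only once and no longer needs the i != i2 test. It is still O(N^2); all_different is just an illustrative name for this sketch.

```python
def all_different(s):
    for i in range(len(s)):
        # Only look at elements after position i: earlier pairs were already checked.
        for j in range(i + 1, len(s)):
            if s[i] == s[j]:
                return False
    return True

print(all_different([1, 2, 3]))     # True
print(all_different([1, 2, 1]))     # False
```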

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | user225312 | View Question on Stack Overflow
Solution 1 - Python | yan | View Answer on Stack Overflow
Solution 2 - Python | PaulMcG | View Answer on Stack Overflow
Solution 3 - Python | 6502 | View Answer on Stack Overflow
Solution 4 - Python | jassinm | View Answer on Stack Overflow
Solution 5 - Python | Grzegorz Oledzki | View Answer on Stack Overflow
Solution 6 - Python | Tugrul Ates | View Answer on Stack Overflow
Solution 7 - Python | PaulMcG | View Answer on Stack Overflow
Solution 8 - Python | Karol | View Answer on Stack Overflow
Solution 9 - Python | mhourdakis | View Answer on Stack Overflow
Solution 10 - Python | ArunPratap | View Answer on Stack Overflow
Solution 11 - Python | yilmazhuseyin | View Answer on Stack Overflow
Solution 12 - Python | Tom | View Answer on Stack Overflow
Solution 13 - Python | canisrufus | View Answer on Stack Overflow
Solution 14 - Python | user1718097 | View Answer on Stack Overflow
Solution 15 - Python | horseshoe | View Answer on Stack Overflow
Solution 16 - Python | Chris | View Answer on Stack Overflow
Solution 17 - Python | DonChriss | View Answer on Stack Overflow