How to efficiently compare two unordered lists (not sets) in Python?

PythonAlgorithmListComparison

Python Problem Overview


a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]

a & b should be considered equal, because they have exactly the same elements, only in different order.

The thing is, my actual lists will consist of objects (my class instances), not integers.

Python Solutions


Solution 1 - Python

O(n): The Counter() method is best (if your objects are hashable):

def compare(s, t):
    return Counter(s) == Counter(t)

O(n log n): The sorted() method is next best (if your objects are orderable):

def compare(s, t):
    return sorted(s) == sorted(t)

O(n * n): If the objects are neither hashable, nor orderable, you can use equality:

def compare(s, t):
    t = list(t)   # make a mutable copy
    try:
        for elem in s:
            t.remove(elem)
    except ValueError:
        return False
    return not t

Solution 2 - Python

You can sort both:

sorted(a) == sorted(b)

A counting sort could also be more efficient (but it requires the object to be hashable).

>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True

Solution 3 - Python

If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)

In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)

len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)

Solution 4 - Python

The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:

sorted(a) == sorted(b)

It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.

sorted(a, key=id) == sorted(b, key==id)

(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)

If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.

sorted(a, key=repr) == sorted(b, key==repr)

If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.

Solution 5 - Python

If you have to do this in tests: https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual

assertCountEqual(first, second, msg=None)

Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.

Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.

New in version 3.2.

or in 2.7: https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual

Outside of tests I would recommend the Counter method.

Solution 6 - Python

If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).

Works on sequences of unhashable objects too.

Solution 7 - Python

If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:

from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
    print("Lists a and b contain the same objects")

Solution 8 - Python

Let a,b lists

def ass_equal(a,b):
try:
    map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
    if len(a) == 0: # if a is empty, means that b has removed them all
        return True 
except:
    return False # b failed to remove some items from a

No need to make them hashable or sort them.

Solution 9 - Python

I hope the below piece of code might work in your case :-

if ((len(a) == len(b)) and
   (all(i in a for i in b))):
    print 'True'
else:
    print 'False'

This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.

For better understanding, refer to my answer in this question

Solution 10 - Python

You can write your own function to compare the lists.

Let's get two lists.

list_1=['John', 'Doe'] 
list_2=['Doe','Joe']

Firstly, we define an empty dictionary, count the list items and write in the dictionary.

def count_list(list_items):
    empty_dict={}
    for list_item in list_items:
        list_item=list_item.strip()
        if list_item not in empty_dict:
            empty_dict[list_item]=1
        else:
            empty_dict[list_item]+=1
    return empty_dict


        

After that, we'll compare both lists by using the following function.

def compare_list(list_1, list_2):
    if count_list(list_1)==count_list(list_2):
        return True
    return False
compare_list(list_1,list_2)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjohndirView Question on Stackoverflow
Solution 1 - PythonRaymond HettingerView Answer on Stackoverflow
Solution 2 - PythonMark ByersView Answer on Stackoverflow
Solution 3 - PythonJohn La RooyView Answer on Stackoverflow
Solution 4 - PythonkindallView Answer on Stackoverflow
Solution 5 - PythonclederView Answer on Stackoverflow
Solution 6 - PythonjarekwgView Answer on Stackoverflow
Solution 7 - PythonMarsView Answer on Stackoverflow
Solution 8 - PythonUmur KontacıView Answer on Stackoverflow
Solution 9 - PythonPabitra PatiView Answer on Stackoverflow
Solution 10 - PythonŞahin Murat OğurView Answer on Stackoverflow