How to sort two lists (which reference each other) in the exact same way

PythonListSorting

Python Problem Overview


Say I have two lists:

list1 = [3, 2, 4, 1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

If I run list1.sort(), it'll sort it to [1,1,2,3,4] but is there a way to get list2 in sync as well (so I can say item 4 belongs to 'three')? So, the expected output would be:

list1 = [1, 1, 2, 3, 4]
list2 = ['one', 'one2', 'two', 'three', 'four']

My problem is I have a pretty complex program that is working fine with lists but I sort of need to start referencing some data. I know this is a perfect situation for dictionaries but I'm trying to avoid dictionaries in my processing because I do need to sort the key values (if I must use dictionaries I know how to use them).

Basically the nature of this program is, the data comes in a random order (like above), I need to sort it, process it and then send out the results (order doesn't matter but users need to know which result belongs to which key). I thought about putting it in a dictionary first, then sorting list one but I would have no way of differentiating of items in the with the same value if order is not maintained (it may have an impact when communicating the results to users). So ideally, once I get the lists I would rather figure out a way to sort both lists together. Is this possible?

Python Solutions


Solution 1 - Python

One classic approach to this problem is to use the "decorate, sort, undecorate" idiom, which is especially simple using python's built-in zip function:

>>> list1 = [3,2,4,1, 1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> list1, list2 = zip(*sorted(zip(list1, list2)))
>>> list1
(1, 1, 2, 3, 4)
>>> list2 
('one', 'one2', 'two', 'three', 'four')

These of course are no longer lists, but that's easily remedied, if it matters:

>>> list1, list2 = (list(t) for t in zip(*sorted(zip(list1, list2))))
>>> list1
[1, 1, 2, 3, 4]
>>> list2
['one', 'one2', 'two', 'three', 'four']

It's worth noting that the above may sacrifice speed for terseness; the in-place version, which takes up 3 lines, is a tad faster on my machine for small lists:

>>> %timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 3.3 us per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best of 3: 2.84 us per loop

On the other hand, for larger lists, the one-line version could be faster:

>>> %timeit zip(*sorted(zip(list1, list2)))
100 loops, best of 3: 8.09 ms per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100 loops, best of 3: 8.51 ms per loop

As Quantum7 points out, JSF's suggestion is a bit faster still, but it will probably only ever be a little bit faster, because Python uses the very same DSU idiom internally for all key-based sorts. It's just happening a little closer to the bare metal. (This shows just how well optimized the zip routines are!)

I think the zip-based approach is more flexible and is a little more readable, so I prefer it.


Note that when elements of list1 are equal, this approach will end up comparing elements of list2. If elements of list2 don't support comparison, or don't produce a boolean when compared (for example, if list2 is a list of NumPy arrays), this will fail, and if elements of list2 are very expensive to compare, it might be better to avoid comparison anyway.

In that case, you can sort indices as suggested in jfs's answer, or you can give the sort a key function that avoids comparing elements of list2:

result1, result2 = zip(*sorted(zip(list1, list2), key=lambda x: x[0]))

Also, the use of zip(*...) as a transpose fails when the input is empty. If your inputs might be empty, you will have to handle that case separately.

Solution 2 - Python

You can sort indexes using values as keys:

indexes = range(len(list1))
indexes.sort(key=list1.__getitem__)

To get sorted lists given sorted indexes:

sorted_list1 = map(list1.__getitem__, indexes)
sorted_list2 = map(list2.__getitem__, indexes)

In your case you shouldn't have list1, list2 but rather a single list of pairs:

data = [(3, 'three'), (2, 'two'), (4, 'four'), (1, 'one'), (1, 'one2')]

It is easy to create; it is easy to sort in Python:

data.sort() # sort using a pair as a key

Sort by the first value only:

data.sort(key=lambda pair: pair[0])

Solution 3 - Python

I have used the answer given by senderle for a long time until I discovered np.argsort. Here is how it works.

# idx works on np.array and not lists.
list1 = np.array([3,2,4,1])
list2 = np.array(["three","two","four","one"])
idx   = np.argsort(list1)

list1 = np.array(list1)[idx]
list2 = np.array(list2)[idx]

I find this solution more intuitive, and it works really well. The perfomance:

def sorting(l1, l2):
    # l1 and l2 has to be numpy arrays
    idx = np.argsort(l1)
    return l1[idx], l2[idx]

# list1 and list2 are np.arrays here...
%timeit sorting(list1, list2)
100000 loops, best of 3: 3.53 us per loop

# This works best when the lists are NOT np.array
%timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 2.41 us per loop

# 0.01us better for np.array (I think this is negligible)
%timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best for 3 loops: 1.96 us per loop

Even though np.argsort isn't the fastest one, I find it easier to use.

Solution 4 - Python

Schwartzian transform. The built-in Python sorting is stable, so the two 1s don't cause a problem.

>>> l1 = [3, 2, 4, 1, 1]
>>> l2 = ['three', 'two', 'four', 'one', 'second one']
>>> zip(*sorted(zip(l1, l2)))
[(1, 1, 2, 3, 4), ('one', 'second one', 'two', 'three', 'four')]

Solution 5 - Python

One way is to track where each index goes to by sorting the identity [0,1,2,..n]

This works for any number of lists.

Then move each item to its position. Using splices is best.

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

index = list(range(len(list1)))
print(index)
'[0, 1, 2, 3, 4]'

index.sort(key = list1.__getitem__)
print(index)
'[3, 4, 1, 0, 2]'

list1[:] = [list1[i] for i in index]
list2[:] = [list2[i] for i in index]

print(list1)
print(list2)
'[1, 1, 2, 3, 4]'
"['one', 'one2', 'two', 'three', 'four']"

Note we could have iterated the lists without even sorting them:

list1_iter = (list1[i] for i in index)

Solution 6 - Python

If you are using numpy you can use np.argsort to get the sorted indices and apply those indices to the list. This works for any number of list that you would want to sort.

import numpy as np

arr1 = np.array([4,3,1,32,21])
arr2 = arr1 * 10
sorted_idxs = np.argsort(arr1)

print(sorted_idxs)
>>> array([2, 1, 0, 4, 3])

print(arr1[sorted_idxs])
>>> array([ 1,  3,  4, 21, 32])

print(arr2[sorted_idxs])
>>> array([ 10,  30,  40, 210, 320])

Solution 7 - Python

You can use the zip() and sort() functions to accomplish this:

Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
>>> list1 = [3,2,4,1,1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> zipped = zip(list1, list2)
>>> zipped.sort()
>>> slist1 = [i for (i, s) in zipped]
>>> slist1
[1, 1, 2, 3, 4]
>>> slist2 = [s for (i, s) in zipped]
>>> slist2
['one', 'one2', 'two', 'three', 'four']

Hope this helps

Solution 8 - Python

What about:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

sortedRes = sorted(zip(list1, list2), key=lambda x: x[0]) # use 0 or 1 depending on what you want to sort
>>> [(1, 'one'), (1, 'one2'), (2, 'two'), (3, 'three'), (4, 'four')]

Solution 9 - Python

You can use the key argument in sorted() method unless you have two same values in list2.

The code is given below:

sorted(list2, key = lambda x: list1[list2.index(x)]) 

It sorts list2 according to corresponding values in list1, but make sure that while using this, no two values in list2 evaluate to be equal because list.index() function give the first value

Solution 10 - Python

Another approach to retaining the order of a string list when sorting against another list is as follows:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

# sort on list1 while retaining order of string list
sorted_list1 = [y for _,y in sorted(zip(list1,list2),key=lambda x: x[0])]
sorted_list2 = sorted(list1)

print(sorted_list1)
print(sorted_list2)

output

['one', 'one2', 'two', 'three', 'four']
[1, 1, 2, 3, 4]

Solution 11 - Python

I would like to expand open jfs's answer, which worked great for my problem: sorting two lists by a third, decorated list:

We can create our decorated list in any way, but in this case we will create it from the elements of one of the two original lists, that we want to sort:

# say we have the following list and we want to sort both by the algorithms name 
# (if we were to sort by the string_list, it would sort by the numerical 
# value in the strings)
string_list = ["0.123 Algo. XYZ", "0.345 Algo. BCD", "0.987 Algo. ABC"]
dict_list = [{"dict_xyz": "XYZ"}, {"dict_bcd": "BCD"}, {"dict_abc": "ABC"}]

# thus we need to create the decorator list, which we can now use to sort
decorated = [text[6:] for text in string_list]  
# decorated list to sort
>>> decorated
['Algo. XYZ', 'Algo. BCD', 'Algo. ABC']

Now we can apply jfs's solution to sort our two lists by the third

# create and sort the list of indices
sorted_indices = list(range(len(string_list)))
sorted_indices.sort(key=decorated.__getitem__)

# map sorted indices to the two, original lists
sorted_stringList = list(map(string_list.__getitem__, sorted_indices))
sorted_dictList = list(map(dict_list.__getitem__, sorted_indices))

# output
>>> sorted_stringList
['0.987 Algo. ABC', '0.345 Algo. BCD', '0.123 Algo. XYZ']
>>> sorted_dictList
[{'dict_abc': 'ABC'}, {'dict_bcd': 'BCD'}, {'dict_xyz': 'XYZ'}]

Solution 12 - Python

I would like to suggest a solution if you need to sort more than 2 lists in sync:

def SortAndSyncList_Multi(ListToSort, *ListsToSync):
	y = sorted(zip(ListToSort, zip(*ListsToSync)))
    w = [n for n in zip(*y)]
	return list(w[0]), tuple(list(a) for a in zip(*w[1]))

Solution 13 - Python

newsource=[];newtarget=[]
for valueT in targetFiles:
    for valueS in sourceFiles:
            l1=len(valueS);l2=len(valueT);
            j=0
            while (j< l1):
                    if (str(valueT) == valueS[j:l1]) :
                            newsource.append(valueS)
                            newtarget.append(valueT)
                    j+=1

Solution 14 - Python

an algorithmic solution:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']


lis = [(list1[i], list2[i]) for i in range(len(list1))]
list1.sort()
list2 = [x[1] for i in range(len(list1)) for x in lis if x[0] == i]

Outputs: -> Output speed: 0.2s

>>>list1
>>>[1, 1, 2, 3, 4]
>>>list2
>>>['one', 'one2', 'two', 'three', 'four']

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLostsoulView Question on Stackoverflow
Solution 1 - PythonsenderleView Answer on Stackoverflow
Solution 2 - PythonjfsView Answer on Stackoverflow
Solution 3 - PythonDaniel Thaagaard AndreasenView Answer on Stackoverflow
Solution 4 - PythonKarl KnechtelView Answer on Stackoverflow
Solution 5 - Pythonrobert kingView Answer on Stackoverflow
Solution 6 - PythonKurtis StreutkerView Answer on Stackoverflow
Solution 7 - PythonHunter McMillenView Answer on Stackoverflow
Solution 8 - PythonArtsiom RudzenkaView Answer on Stackoverflow
Solution 9 - PythonSaurav YadavView Answer on Stackoverflow
Solution 10 - PythonbrockView Answer on Stackoverflow
Solution 11 - Pythonfrietz58View Answer on Stackoverflow
Solution 12 - PythonprokofievloverView Answer on Stackoverflow
Solution 13 - Pythonuser10340258View Answer on Stackoverflow
Solution 14 - PythonJundullahView Answer on Stackoverflow