400x Sorting Speedup by Switching a.localeCompare(b) to (a<b?-1:(a>b?1:0))

JavascriptGoogle ChromeSorting

Javascript Problem Overview


By switching a javascript sort function from

myArray.sort(function (a, b) {
  return a.name.localeCompare(b.name);
});

to

myArray.sort(function (a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

I was able to cut the time to sort a ~1700 element array in Chrome from 1993 milliseconds to 5 milliseconds. Almost a 400x speedup. Unfortunately this is at the expense of correctly sorting non-english strings.

Obviously I can't have my UI blocking for 2 seconds when I try to do a sort. Is there anything I can do to avoid the horribly slow localeCompare but still maintain support for localized strings?

Javascript Solutions


Solution 1 - Javascript

A great performance improvement can be obtained by declaring the collator object beforehand and using it's compare method. EG:

const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });
arrayOfObjects.sort((a, b) => {
  return collator.compare(a.name, b.name);
});

NOTE: This doesn't work ok if the elements are floats. See explanation here: https://stackoverflow.com/questions/40107588/intl-collator-and-natural-sort-with-numeric-option-sorts-incorrectly-with-decima

Here's a benchmark script comparing the 3 methods:

const arr = [];
for (let i = 0; i < 2000; i++) {
  arr.push(`test-${Math.random()}`);
}

const arr1 = arr.slice();
const arr2 = arr.slice();
const arr3 = arr.slice();

console.time('#1 - localeCompare');
arr1.sort((a, b) => a.localeCompare(
  b,
  undefined, {
    numeric: true,
    sensitivity: 'base'
  }
));
console.timeEnd('#1 - localeCompare');

console.time('#2 - collator');
const collator = new Intl.Collator('en', {
  numeric: true,
  sensitivity: 'base'
});
arr2.sort((a, b) => collator.compare(a, b));
console.timeEnd('#2 - collator');

console.time('#3 - non-locale');
arr3.sort((a, b) => (a < b ? -1 : (a > b ? 1 : 0)));
console.timeEnd('#3 - non-locale');

Solution 2 - Javascript

An effective Approach that I found when dealing with /mostly/ latin characters is to use the operator whenever both strings match a specific regex. EG: /^[\w-.\s,]*$/

It's much faster if both strings match the expression, and at worst it seems to be slightly slower than blindly calling localeCompare.

Example here: http://jsperf.com/operator-vs-localecompage/11

Update: it seems like Intl.Collator is currently the best option for performance across the board: https://jsperf.com/operator-vs-localecompage/22

Solution 3 - Javascript

It's difficult to know the fastest sort without seeing the data you are sorting. But jsperf has lots of good tests showing the performance differences between types of sorting: http://jsperf.com/javascript-sort/45 http://jsperf.com/sort-algorithms/31

However none of these account for localised strings, and i'd imagine there is no easy way to sort localised strings and localeCompare is probably the best solution for this.

Looking at mozilla reference is says: "When comparing large numbers of strings, such as in sorting large arrays, it is better to create an Intl.Collator object and use the function provided by its compare property." https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare

But going to the Intl.Collator reference it shows that is not support for firefox/safari https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator

you could try using some of the options on localCompare to speed up the performance. But I've just done a quick test changing the sensitivity level and it seems like it won't improve the performance:

list.sort(function(a, b) {
  return a.localeCompare(b, {sensitivity:'base'});
});

http://jsperf.com/sort-locale-strings

Solution 4 - Javascript

Try sorting it in 2 steps:

  1. With the operator: as you said, it will be 400 times faster
  2. Then with localCompare(): this has now less comparisons to do because the array is mostly sorted.

Note: I think that localCompare() will mostly be called with at least 1 string that is not english. So the number of calls to localCompare() with 2 english strings should be greatly reduced.

Here is the code:

myArray.sort(function(a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

myArray.sort(function(a, b) {
  return a.name.localeCompare(b.name);
});

This solution has the advantage of being short and easy to use. It will be efficient if the array contains mostly english strings. The more non-english strings you have, the less useful the first sort will be. But as it is easy to add in your scripts, it is also easy to see if this approach is worthwile.

Now if I were you, I would also use an Intl.Collator, as it is said to be much faster than localCompare() when you have many comparisons to do.

Solution 5 - Javascript

I don't know you still looking for solution to this problem

// Defaulted to ascending
// 1 asc | -1 desc
var direction = 1; 
myArray.sort(function (a, b) {
  return a.name.localeCompare(b.name) === 1 ? direction : -1 * direction;
});

i added a === 1 check to your code and this improved perf 400x that means both have comparable perf numbers.

Perf numbers with localeCompare arr size: 3200 Avg time took in 10 repetitions : 60 ms

Perf numbers with > approach. avg time took 55 ms

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBrad DwyerView Question on Stackoverflow
Solution 1 - JavascriptAndyView Answer on Stackoverflow
Solution 2 - JavascriptJamie PateView Answer on Stackoverflow
Solution 3 - JavascriptKim TView Answer on Stackoverflow
Solution 4 - JavascriptjlgrallView Answer on Stackoverflow
Solution 5 - JavascriptJosephView Answer on Stackoverflow