Why is one string greater than the other when comparing strings in JavaScript?


Javascript Problem Overview


I see this code from a book:

var a = "one";
var b = "four";
a>b; // will return true

but it doesn't mention why "one" is bigger than "four". I tried c = "a" and it is smaller than a and b. I want to know how JavaScript compares these strings.

Javascript Solutions


Solution 1 - Javascript

Because, as in many programming languages, strings are compared lexicographically.

You can think of this as a fancier version of alphabetical ordering, the difference being that alphabetic ordering only covers the 26 characters a through z.


This answer is in response to a Java question, but the logic is exactly the same. Another good one: https://stackoverflow.com/questions/1863028/string-compare-logic.
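A few comparisons may make the difference from plain alphabetical ordering more concrete. This is only a small sketch; the variable names are made up for illustration, and the results follow from the UTF-16 code unit values noted in the comments.

```javascript
// Where only lowercase letters are involved, the result matches the alphabet.
const alphabetical = "apple" < "banana"; // true: "a" (97) < "b" (98)

// Uppercase letters have smaller code units than lowercase ones,
// so "Z" (90) comes before "a" (97).
const upperBeforeLower = "Z" < "a"; // true

// Digits are compared as characters, not as numbers:
// "1" (49) is less than "9" (57), so "10" comes before "9".
const digitsAsText = "10" < "9"; // true

console.log(alphabetical, upperBeforeLower, digitsAsText);
```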

Solution 2 - Javascript

"one" starts with 'o', "four" starts with 'f', 'o' is later in the alphabet than 'f' so "one" is greater than "four". See this page for some nice examples of JavaScript string comparisons (with explanations!).

Solution 3 - Javascript

JavaScript uses lexicographical order for the > operator. 'f' precedes 'o', so the comparison "one" > "four" returns true.

Solution 4 - Javascript

In the 11th edition of the ECMAScript Language Specification the "Abstract Relational Comparison" clause defines how to compute x < y. When the expression is reversed (i.e. x > y) we should compute the result of y < x instead.

So to solve "one" > "four" we must solve "four" < "one" instead.

The same clause says this:

> The comparison of Strings uses a simple lexicographic ordering on sequences of code unit values.

And this if both operands are strings:

> 3. If Type(px) is String and Type(py) is String, then
>    - If IsStringPrefix(py, px) is true, return false.
>    - If IsStringPrefix(px, py) is true, return true.
>    - Let k be the smallest nonnegative integer such that the code unit at index k within px is different from the code unit at index k within py. (There must be such a k, for neither String is a prefix of the other.)
>    - Let m be the integer that is the numeric value of the code unit at index k within px.
>    - Let n be the integer that is the numeric value of the code unit at index k within py.
>    - If m < n, return true. Otherwise, return false.

(We can safely ignore the first two points for this example)

So let's see the code units for "four":

[..."four"].map(c => c.charCodeAt(0));
//=> [102, 111, 117, 114]

And for "one":

[..."one"].map(c => c.charCodeAt(0));
//=> [111, 110, 101]

So now we must find the smallest k (starting at 0) where m[k] and n[k] are different, with m holding the code units of "four" and n those of "one":

|   | 0   | 1   | 2   | 3   |
|---|-----|-----|-----|-----|
| m | 102 | 111 | 117 | 114 |
| n | 111 | 110 | 101 |     |

We can see that m[0] and n[0] already differ at index 0, so k is 0.

Since m[0] < n[0] is true then "four" < "one" is true and thus "one" > "four" is true.
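The quoted steps can be sketched as a small function. This is only an illustration, not how any engine implements it: `isStringLessThan` is a hypothetical name, and `startsWith` is used here as a stand-in for the specification's IsStringPrefix.

```javascript
// Hypothetical helper mirroring the String branch of the quoted algorithm.
function isStringLessThan(px, py) {
  // IsStringPrefix(py, px): py is a prefix of px, so px is not less than py.
  if (px.startsWith(py)) return false;
  // IsStringPrefix(px, py): px is a prefix of py, so px is less than py.
  if (py.startsWith(px)) return true;

  // Find the smallest index k at which the code units differ.
  // Such a k exists because neither string is a prefix of the other.
  let k = 0;
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;

  const m = px.charCodeAt(k); // code unit at index k within px
  const n = py.charCodeAt(k); // code unit at index k within py
  return m < n;
}

console.log(isStringLessThan("four", "one")); // true, so "one" > "four" is true
console.log(isStringLessThan("one", "four")); // false
```

For two strings, the result should agree with the built-in `<` operator, which makes the function easy to check against other pairs.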


What does "☂︎" < "☀︎" return?

[..."☂︎"].map(c => c.charCodeAt(0))
//=> [9730, 65038]
[..."☀︎"].map(c => c.charCodeAt(0))
//=> [9728, 65038]

|   | 0    | 1     |
|---|------|-------|
| m | 9730 | 65038 |
| n | 9728 | 65038 |

Since 9730 < 9728 is false then "☂︎" < "☀︎" is false which is nice because rain is not better than sun (obviously ;).

Solution 5 - Javascript

When you use a relational operator like <= with strings in JavaScript, you're comparing their underlying Unicode code units,¹ one at a time from the beginning, stopping the first time you find any difference. "one" > "four" is true because "o" (code unit 111) is greater than "f" (code unit 102). Since a difference is found in the first character, the rest of the characters are ignored. If you had "fb" > "fa", the two "f"s would be compared, found to be the same, and then the next letter of each string ("b" and "a") would be compared. If the strings are different lengths and the longer one starts with the shorter one, the shorter one is "less than" the longer one ("aaa" < "aaab" is true).
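The three cases described above can be tried directly. The variable names below are only for illustration.

```javascript
// First characters differ: "o" (111) vs "f" (102); the rest is ignored.
const firstCharDecides = "one" > "four"; // true

// First characters match, so the second ones decide: "b" (98) vs "a" (97).
const secondCharDecides = "fb" > "fa"; // true

// The shorter string is a prefix of the longer one, so it is "less than".
const prefixIsLess = "aaa" < "aaab"; // true

console.log(firstCharDecides, secondCharDecides, prefixIsLess);
```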

This used to be covered by the Abstract Relational Comparison operation in the specification, but now it's the IsLessThan operation.


¹ The fact that the relational operators use code units is one good reason not to use them with strings, since the code unit order in many cases doesn't map well to people's expectations based on their language ("é" < "z" is false, which probably makes little sense to French speakers); instead, use localeCompare, perhaps with some optional settings to compare appropriately for the language the strings contain ("é".localeCompare("z", "fr") < 0 is true, because é comes before z in a proper lexicographical order in the "fr" locale).
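As a rough illustration of the footnote, the sketch below sorts the same words by code unit order and by French collation. The word list is invented for the example, `Intl.Collator` is used as one way to get a locale-aware comparison function alongside localeCompare, and the locale-aware result assumes the runtime ships collation data (as current browsers and Node.js builds normally do).

```javascript
// "\u00e9" is "é" written as an escape so the example does not depend on
// how the source file encodes the accented letter.
const words = ["zone", "\u00e9cole", "arbre"];

// Default sort compares code units: "é" (233) ends up after "z" (122).
const byCodeUnit = [...words].sort();

// A collator compares according to the rules of the given locale.
const collator = new Intl.Collator("fr");
const byLocale = [...words].sort(collator.compare);

console.log(byCodeUnit); // arbre, zone, école
console.log(byLocale);   // arbre, école, zone
```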

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
|---|---|---|
| Question | patriot7 | View Question on Stackoverflow |
| Solution 1 - Javascript | Matt Ball | View Answer on Stackoverflow |
| Solution 2 - Javascript | Paul | View Answer on Stackoverflow |
| Solution 3 - Javascript | martin | View Answer on Stackoverflow |
| Solution 4 - Javascript | customcommander | View Answer on Stackoverflow |
| Solution 5 - Javascript | T.J. Crowder | View Answer on Stackoverflow |