Today I learned something strange about the sort function in JavaScript.
This evening I was going through a course on algorithms in JavaScript. The instructor gave a task to create a unique sort function that took an array, removed duplicate values and sorted it in ascending order.
They gave some starting code to work with:
const uniqSort = arr => {
  const breadcrumbs = {}
  return arr.sort((a, b) => a - b)
}
Before tackling the task at hand I said to myself: aha! I don't need to have that function within sort, as the sort function already sorts in ascending order!
But then I thought to myself: wait, the instructor put that there for a reason. Let's find out why.
So I went digging. Here is the documentation on the sort function.
If compareFunction is not supplied, all non-undefined array elements are sorted by converting them to strings and comparing strings in UTF-16 code units order.
...
In a numeric sort, 9 comes before 80, but because numbers are converted to strings, "80" comes before "9" in the Unicode order.
Pardon?
So, if I do the following:
[1, 80, 9].sort()
I get the array back in the same order, not the [1, 9, 80] I expected:
[1, 80, 9]
That's because my array values are converted to strings before sorting and, as we saw above, "80" comes before "9" in the Unicode order.
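To convince myself, here is a rough sketch of what I understand the default behaviour to be. The `defaultCompare` helper is my own hypothetical stand-in, not something the language exposes, and it ignores details such as `undefined` values, which the real sort moves to the end of the array.

```javascript
// Hypothetical helper: approximately what sort() does when it is
// called without a compare function (undefined handling is left out).
const defaultCompare = (a, b) => {
  const x = String(a)
  const y = String(b)
  if (x < y) return -1 // string comparison works on UTF-16 code units
  if (x > y) return 1
  return 0
}

console.log('80' < '9')                          // true, because '8' sorts before '9'
console.log([10, 9, 1, 80].sort())               // [ 1, 10, 80, 9 ]
console.log([10, 9, 1, 80].sort(defaultCompare)) // [ 1, 10, 80, 9 ]
```

Both sorts give the same "wrong" order, which is what the documentation describes: the numbers are compared character by character as strings.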
I'm told this is a common "gotcha!", yet I hadn't encountered it in the 4+ years I've been a developer. I count myself lucky, as I bet it would have been annoying to debug.
Make sure you always pass a compare function when you sort numbers!
[1, 80, 9].sort((a, b) => a - b)
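Coming back to the course task, here is a minimal sketch of how the exercise might be finished. Only `uniqSort`, `breadcrumbs` and the compare function come from the starter code. The loop and the `result` array are my own guess at a solution, not the instructor's, and the sketch assumes the array holds only numbers (object keys are strings, so `1` and `"1"` would count as the same value).

```javascript
const uniqSort = arr => {
  const breadcrumbs = {} // remembers which values have already been seen
  const result = []      // my addition: a new array, so the input is not mutated

  for (const value of arr) {
    if (!breadcrumbs[value]) {
      breadcrumbs[value] = true
      result.push(value)
    }
  }

  // a - b is negative when a should come first and positive when b should,
  // so the numbers are compared as numbers rather than as strings
  return result.sort((a, b) => a - b)
}

console.log(uniqSort([4, 2, 2, 3, 2, 2, 2])) // [ 2, 3, 4 ]
console.log(uniqSort([1, 80, 9, 80, 1]))     // [ 1, 9, 80 ]
```

Without the compare function, the second call would return [1, 80, 9], which is presumably why the instructor put it there.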
I'm so curious about why this decision was made when the sort function was originally written. If anyone knows, please comment. What other weird things in JavaScript do you know about that I might not?
Top comments (4)
This is a great heads-up! I had run into this problem, but honestly never understood why before. Now I'll remember. :)
I don't know for sure why it works this way, but my guess is that, because the function takes an array that could potentially contain any mix of values and types, they all need to be converted to something common in order to be compared. And since String is the only type you can (somewhat) reliably convert other types into, that was the choice.
Presumably, it was decided that the function shouldn't do any other conversion (like adding zeroes before numbers), in order to keep the function simple, and preserve the original data with as little munging as possible (and probably to avoid other, similar issues that would've arisen with trying to predict input, like how many zeroes to add, whether they would be added before or after the type conversion, and whether incoming string values containing numbers would be affected).
This is an excellent suggestion, thanks!
Hey James. Thanks for pointing that out about sorting. I haven't run into that before either. Now I'll always remember to use a function.
Glad it helped someone else!