I LOVE fun solutions to interview problems. When prepping for interviews, I believe it's important to understand the capabilities and data structures in any given language as they can help you solve menial problems more efficiently.
An interesting interview problem I once had was, "Given an array of n numbers, how would you find if there are any duplicates?"
When faced with this problem as a junior JavaScript developer, I thought the solution would be simple. Just sort the array and then loop through it, while comparing the current index with the previous index. If they match, a duplicate is found!
const duplicateCheck = (numbers) => {
// Sort the numbers
numbers = numbers.sort();
// Loop through the numbers
for (let i = 0; i < numbers.length; i++) {
if (i > 0) {
// Compare the current index with the previous
if (numbers[i] === numbers[i-1]) {
// If they match we found a duplicate, we can stop here
return true;
}
}
}
return false;
};
Sure this works, and your interviewer seems happy, but then they ask, "Can you make it faster?" Then you realize that maybe this isn’t the best solution... While the initial sort is fairly fast, running with a time complexity of Θ(n log(n))
, we also have a loop after it with a time complexity of Θ(n)
. At the end of the day, the function itself runs at Θ(n log(n))
and it may not be the fastest solution.
Okay, let's simplify this to a single loop. We could just loop through the unsorted array and keep track of the values already found. If we end up finding a value we already checked, then we know we have a duplicate and we can stop there.
const duplicateCheck = (numbers) => {
// Store found numbers
const found = {};
// Loop through the numbers
for (let number of numbers) {
// If number has been seen
if (found[number]) {
// End it here, we found a duplicate
return true;
} else {
// If we didn't see it yet, let's log that we've seen it once
found[number] = true;
}
}
return false;
};
This is neater and faster! Its time complexity is now Θ(n)
since we loop through the array, but we skip the sort. This is a faster solution, and you start to feel good about how the interview is going. And then the interviewer asks, "Can you do this with less code?"
After your heart skips a beat and dread sets in, you remember something your friend (me) said: "It's important to understand the capabilities and data structures in any given language." In JavaScript, you have access to the Set
object!
Set objects are collections of values. You can iterate through the elements of a set in insertion order. A value in the Set may only occur once; it is unique in the Set's collection.
So you write the following:
const duplicateCheck = (a) => new Set(a).size !== a.length;
By passing the array into a new Set
, you know that the Set will not allow for any duplicate elements to be added. You now have an iterable without duplicates. The final step is to compare the size of the deduped Set
against the length of the original array. If they're the same, then there are no duplicates. If they're different, you know that duplicates were removed.
You now have a solution that keeps the time complexity of Θ(n)
without the need of a for loop and without needing to keep track of numbers already seen. Instead, you have a neat one-line solution.
I love these one-line solutions! Hopefully, you found this helpful. If you have any interesting or clever solutions to interview questions, I'd love to hear them in the comments. Or if you have a better solution to finding duplicates in an array, I'd love to hear that too.
To keep up with everything I’m doing, follow me on Twitter and dev.to. If you’re thinking, “Show me the code!” you can find me on GitHub.
Top comments (24)
My answer to "Can you make it faster" is:
Do we need to? Has this become a bottle neck performance wise?
If not, don't care, lets spend time building a better product where it actually matters.
While I completely agree with that, I think main purpose of that excercise is to identify that developer know what performance limits of that code are.
I mean, You don't have to worry about performance issues when your devs are delivering optimized code by default, while avoiding premature optimizations :)
Of course, if they answer with "Lets assume that yes it has become/will become a bottle neck", then I would do my best to optimize.
^^^^ This ^^^^
Users don't care how a product is built. They care what the product is and that it solves their problems.
And this extends beyond programming
Hows the marketing?
Hows user retention?
Is it actually solving the target group's problem?
You can have the most efficient and well engineered product but if no one knows about it or it isn't solving their problem, none of that matters
OK, now I'm curious: Does the ECMA standard actually guarantee Θ(n) time complexity for instantiation of a
Set
from an arbitraryArray
?I could easily see all the implementations happening to have Θ(n) time complexity for it, but unless the standard says that it always will be, you can't really say it's Θ(n) without specifying an implementation.
So the first method is
Θ(n log(n))
, Mozilla and Google's implementations are a Merge Sort and Timsort respectively.As for the the ECMA standard for a Set, internally they're recommended to use hash tables (or something similar). Insertion in hash tables run at
Θ(1)
, so an array ofn
elements would reasonably beΘ(n)
.Well yes, but worst case of hash tables is linear. If you don't mind the space you can create an auxiliar array and mark the positions. This is the "same" idea used in the hash table but it guarantees a O(n) solution. This solution requires O(n) extra space complexity, but hash table in worst case is also O(n) ( when all numbers are different)
Good point, the fact that it's built-in doesn't necessarily mean faster.
Great idea for an article. It's a good reminder that interviewers may want to drill down into a question. Just FYI, the original function is
O(n log(n) + n)
which isO(n log(n))
. It isn'tO(n)
. So your second version is actually quite a bit faster. Also, big-theta (Θ
) is different from big-oh.You're right, it's been a while since I've had to do time complexity, I'll update it.
I had to double-check myself that
n log(n)
is higher order thann
:)You can do it with less code, the parenthesis can be omitted here. :P
If you are going to assume
a
is an array, then you could even use!=
.Haha, I thought so too. And I guess we could use
let
instead ofconst
since it's one letter less.Shorter and more useful since it returns the number of duplicates
And now with a meaningful name. Because, what does
duplicateCheck
mean, that is has, or does not have duplicates. For the boolean versionhasDuplicates
would be the better name.I really like one-liners, reminds me of how fun it was working with Perl way back when :-)
Here's one that I came up with a couple of days ago:
How to convert RGB values to a capitalised hex string? Mind you that valid decimal values for RGB are 0 - 255. Any values that fall out of that range must be rounded to the closest valid value.
Golfed, but less checking:
Nice one! Here's a list I made a few months back: dev.to/jpantunes/js-warmup-exercis...
On a second look, your version only shaves off 2 characters and introduces a possible bug imho because you don't control how many parameters are passed to the function, so calling it either less or more than 3 values it will output a wrong value, right?
I love the "FP" (Functional Programming) style enhancements that have been added to Javascript and various other programming languages ... whenever I see a chance to replace an old-fashioned loop with a more concise map-filter-reduce construct I'm not hesitating to do so.
Great stuff, I might ask the interviewer "what is the reason for it?", "May i know In what area for this piece of code will be used?" to buy yourself time to think and help to understand the question better instead of just doing what you are told?
It's always important to try to extract as much info about the problem you're solving for. I agree with that.
Haha, true. Honestly coming up with titles is hard 😅. Buuut that's the solution to the question, which I guess requires you to click and read the article to see...