loading...

Removing duplicates from a string in one planet-size JavaScript statement

sergeis profile image Sergei ・2 min read

JavaScript array functions are one feature borrowed from functional programming that are relatively easy to wrap your head around. I wrote before about doing FizzBuzz in one planet-size JavaScript statement, and this post is about solving another basic coding interview question in one statement: removing duplicates from a string.

A standard approach to removing duplicates is to create a set of seen characters as we go through the string and only retain those that have not been seen. But that's not the only way to do it. Note that we can easily sort an array, so all the duplicates are bunched together, and then reduce repeated sequences to the first element only. Note the word reduce, because that's the array function we will use to do it!

It may seem expensive and wasteful to use sort() to accomplish what we need, but remember that comparison sorting is O(n log(n)), not far off O(n) that is the lowest order possible for string operations.

So, here is what we intend to do:

  1. sort the given string so that the duplicates are all together
  2. replace all substrings of repeated characters with the character that is repeated
  3. restore the original order.

We can do all of those in a single JavaScript statement, though it gets a bit long. Here is an annotated example that you can copy and paste into the debug console:

'4366447654434567876'.split('')             // string to array
    .map((e,i)=>({val:e,pos:i}))            // remember the original position of each character
    .sort((a,b)=>a.val.localeCompare(b.val))// sort by value
    .reduce((acc,e)=>acc.length == 0 
        || acc[acc.length-1].val!=e.val?    // keep if not a duplicate 
        acc.concat([e]):acc, 
        [])                                 // empty array as a seed        
    .sort((a,b)=>a.pos-b.pos)               // restore the original order
    .map(e=>e.val)                          // select original values
    .join('')                               // back to string!
;

And the answer is, as expected

"436758"

Most of the above should be self-explanatory, but it is worth explaining what we did in the reduce() function:

  1. start with an empty accumulator array []
  2. insert the first element unconditionally (hence the check for acc.length == 0)
  3. for each subsequent element, if it is not a duplicate, add it to the accumulator array
  4. otherwise leave the accumulator unchanged.

Now, this is definitely more wasteful than using an intermediate set of seen characters, so don't offer it during a coding interview as your first choice! But it is neat to know that we can do it all in one (veeeery long) line. Or, if you are an interviewer who is bored out of their mind with asking the same questions over and over, you might want to mix it up by challenging the poor interviewee to answer a question like that in one line.

Another note. It is not very hard to do the "set of seen" algorithm as a one-liner as well, can you do it?

Posted on by:

Discussion

pic
Editor guide