In this challenge, you'll mimic Google's "Did you mean...?" feature. When using Google's search engine, the user will receive spelling corrections and suggestions to aid their search. From a given (or self-made) dictionary, write a function that will accept a string and return the most similar string from the dictionary.
To complete this challenge, you'll have an entered term (lowercase string) and an array of known words (also lowercase strings). Your task is to find out which word from the dictionary is most similar to the input term. Their similarity is defined by the minimum number of letters that must be added, replaced, or removed to get from the entered word to one of the dictionary words. The lower the number of required changes is, the more similar the words are.
Words that are the same are obviously the most similar. A word that needs one letter to be changed is more similar to another word that needs 2 (or more) letters to be changed.
Use the arrays below, combine them, or make your own dictionaries.
Array 1 + Example
['cherry', 'pineapple', 'melon', 'strawberry', 'raspberry']
fruits.findMostSimilar('strawbery');
// must return "strawberry"
fruits.findMostSimilar('berry');
// must return "cherry"
Array 2
['stars', 'mars', 'wars', 'codec', 'codewars']
Array 3
['javascript', 'java', 'ruby', 'php', 'python', 'coffeescript']
Good luck!
This challenge comes from user BattleRattle. Thank you to CodeWars, who has licensed redistribution of this challenge under the 2-Clause BSD License!
Want to propose a challenge for a future post? Email yo+challenge@dev.to with your suggestions!
Top comments (9)
Implemented the algorithm in the Wikipedia article for Levenshtein distance in JavaScript:
EDIT
Forgot to add testing code, so here's some test cases. (The test runner assumes that it's being run in the web console, by the way.)
And here's the output when I run it in Chrome:
Here it goes a python one :
I compare by the distances of each char and give less importance to length
plzz explain ur code to me ...
It's not working 100% but for most basic cases it does, first of all I check if the word passed is inside the fruits list, if not i iterate through each character of the shortest word and I subtract with abs (so that the result is always positive) the value of the characters with the Ord function, then I add the difference of word lengths.
It is kind of a own metric that is not very thinked but it worked to me for these simple tests.
Finally I select as a result the word containing the less value of the metrics I calculated.
After completing the challenge this way I read about levenstein function, it may be a better way to do this.
Sorry I typed on my phone.
Will optimise The levenstein function asap. Right now Tail call optimization doesn’t apply.
Haskell:
This one was interesting for me: I'm pretty new to haskell and this is my first time playing around with memoization (without memoization, the
lev
function is absurdly slow).The lev function is a recursive implementation of the Levenshtein distance from the Wikipedia article
Elixir:
And yes,
String.myers_difference/2
is built-in :)There's no validation of the input to decode. Also, I'm not sure if the second test provided for decode is correct. I'm not getting any English words out of it, but my decode function hasn't failed besides that test.
The lev function is a recursive implementation of the Levenshtein distance from the Wikipedia article happy wheels