<<< Introduction | <<< Setting Up | <<< Telling Stories
Words, words, words
In this article I'll present code for an anagram finder component. As before, my target audience is people with levels of technical skill ranging from near-beginner to intermediate.
In the previous article I showed how the user interface of the app can be built in such a way as to make maintenance easy, by using the EasyCoder scripting language. The actual job of finding anagrams, however, is too much for scripts of that kind to handle; instead we need some JavaScript.
Here's the complete source code. I doubt if it's the most efficient algorithm ever written but it's quite easy to follow. I've mostly kept to traditional JavaScript instead of the more recent flavors.
const AnagramFinder = {
alphabet: [
`a`, `b`, `c`, `d`, `e`, `f`, `g`, `h`, `i`, `j`, `k`, `l`, `m`,
`n`, `o`, `p`, `q`, `r`, `s`, `t`, `u`, `v`, `w`, `x`, `y`, `z`
],
// Get anagrams from the given text
getAnagrams: function (text, list) {
const words = [];
let remaining = text;
let found;
while (remaining) {
found = false;
const reduced = AnagramFinder.reduce(list, remaining.length);
AnagramFinder.shuffle(reduced);
const mtext = AnagramFinder.measure(remaining);
for (let n = 0; n < reduced.length; n++) {
const letters = AnagramFinder.measure(reduced[n]);
if (AnagramFinder.contains(mtext, letters)) {
words.push(reduced[n]);
remaining = AnagramFinder.remove(mtext, letters);
found = true;
break;
}
}
if (!found) break;
}
return {
status: found ? `found` : ``,
words
};
},
// Remove a found word from the remaining text
remove: function(remaining, letters) {
let result = '';
for (let n = 0; n < 26; n++) {
const letter = AnagramFinder.alphabet[n];
remaining[letter] -= letters[letter];
for (let m = 0; m < remaining[letter]; m++) {
result += AnagramFinder.alphabet[n];
}
}
return result;
},
// Copy a list, removing words longer than a given length
reduce: function(list, len) {
const result = [];
for (let n = 0; n < list.length; n++) {
if (list[n].length <= len) {
result.push(list[n]);
}
}
return result;
},
// Shuffle a list
shuffle: function(list) {
for (let i = list.length - 1; i > 0; i--) {
const j = Math.floor(Math.random() * (i + 1));
[list[i], list[j]] = [list[j], list[i]];
}
},
// Check if the first string contains the second, by comparing character maps
contains: function(a, b) {
for (let n = 0; n < 26; n++) {
if (a[AnagramFinder.alphabet[n]] < b[AnagramFinder.alphabet[n]]) {
return false;
}
}
return true;
},
// Measure a word, returning a map of letter counts
measure: function (t) {
const map = {};
for (let n = 0; n < 26; n++) {
map[AnagramFinder.alphabet[n]] = 0;
}
const text = t.toLowerCase();
for (let n = 0; n < text.length; n++) {
const c = text.charAt(n);
if (c.toLowerCase() != c.toUpperCase()) {
map[c]++;
}
}
return map;
}
};
The component is constructed as a single object, ideally with a name that won't clash with anything else you might add to your web page. Inside are all the functions it uses.
It starts by defining an array comprising the letters of the alphabet. Then comes the main program, getAnagrams()
. This is given the text to process and the dictionary to use, which permits the library to be used for languages other than English if a suitable word list is available.
The function looks down the dictionary for any word it can make using the letters available. When it finds one it adds it to the output list, removes the letters it used from the original text and tries again. Eventually it will either have taken all the letters available or failed to find a word. Either way, it sends back a small object with the words it found and a success/fail flag.
The dictionary is randomised before starting a run, otherwise the same words would come back each time. Words are compared by converting them into sets of 26 elements that hold the number of times each letter of the alphabet occurs in the word.
The remove()
function removes a word from the text.
The reduce()
function copies the original list, removing all words longer than the input text.
The shuffle()
function randomises the list of words.
The contains()
function tests if one string of letters contains all the letters of another string. This is not the same as the JavaScript function indexOf()
, which requires the letters to be in the right order. Here we only care if they are present.
The measure()
function converts a word into a map of letter counts. It uses a neat but far from obvious trick to tell if a character is alphabetic, by testing if it changes when converted from lower to upper case. Only alphabetics do that.
Who does what
The anagram finder will take a variable length of time to search for a set of words, and the result will be success or failure. The next action will be to try again in the hope that shuffling the dictionary will produce a result. With over 30,000 words in a dictionary this may have to be done millions of times before all the possible combinations have been winkled out, so the question is how to organise these retries. It is of course possible to add a retry mechanism to the finder itself, but this has problems:
- JavaScript has only a single thread, so while the finder is running everything else will be blocked. A single anagram search, using the function above, takes only a fraction of a second, even on a smartphone, so the block won't be noticed, but running it thousands of times is another matter. You could get the finder to pause once in a while and allow another interface to receive a stop message, but this makes interfacing to the component more complex. Other than that there's no way to stop the program other than by killing the browser process, which is pretty drastic.
- Modern computers are sprinters, not marathon runners. They can go incredibly quickly but only for a short time, after which they rapidly start to overheat. If you ever code a loop and forget to increment the loop counter, after a few seconds the computer's fan starts up, and if you have a temperature monitor you'll see it shooting up to dangerous levels. Again, the only way to stop it is by killing its process, except you can't get it to listen to you because the entire computer is locked up and about to melt down. All you can do is hit the reset button.
If you want to run the anagram finder a thousand or more times you'll need to put in a pause between runs to let the computer catch its breath and cool off a bit. Pauses in JavaScript make the code harder to follow as they have to be done with callbacks, and you still have to consider how to tell the finder to stop running.
These factors have very little to do with actually finding anagrams, so to keep things clean it's best to let the web app itself handle the retries, the pauses and the user interactions. Then the anagram finder can just do its job without the need to worry about all that other stuff.
The library API
From the web app's point of view there's only one useful function exposed by this module; getAnagrams()
. Other functions are available but are unlikely to be needed outside the finder itself. The fewer the interfaces, the better in general, as it avoids systems coming to look like a bowl of spaghetti and makes it simpler to integrate the library into a web app.
Some web components - such as Google Maps - are visual in nature, while others, such as the anagram finder, just do data processing of some kind. The important consideration is whether they have a clear set of APIs that allows them to be fully separated from the environment they will be used in. That way they can be easily plugged in and unplugged or replaced by other components such as test harnesses. When you build a web page, look for parts of it that are really self-contained and build them as separate components. This reduces the size of the code you must manage and in many cases improves reliability because the components are suitable for use in several places, which increases the amount of testing they undergo.
Coming up...
The final part of this series is an appendix describing the plugin mechanism of EasyCoder that enables support for extra features such as the anagram finder to be added seamlessly to the language. It's unlikely to be of much interest to anyone except those who use EasyCoder and have external functionality they want to handle in the language.
For everyone else, may I thank you for reading this series and wish you success and enjoyment in building your own web apps, however you choose to do it.
Title photo by Clark Tibbs on Unsplash
Top comments (0)