DEV Community

Cover image for Counting word lengths (map approach)
Sebastian Nozzi
Sebastian Nozzi

Posted on

Counting word lengths (map approach)

In a previous post we saw already how to solve the problem by using an array.

Specifically we had a 12-length array, and each position was used to increase a counter according to the current word's length. We had to convert from word-length to array-index by subtracting 1 from the first number.

This solution worked perfectly fine and given the constraint of only considering word-lengths up to 12, I would go this route.

But it is worth looking at a map-based approach (in other languages "map" is known as "dictionary"). Some other problems are better solved with this approach.

Recap

The complete array-based solution looked like this:

import "listUtil"

words = file.readLines("/sys/data/englishWords.txt")

word_length_counters = [0] * 12

for word in words
    word_length = len(word)
    if word_length > 12 then continue
    counter_index = word_length - 1
    current_count = word_length_counters[counter_index]
    word_length_counters[counter_index] = current_count + 1
end for

max_count = word_length_counters.max
max_count_index = word_length_counters.indexOf(max_count)
max_word_length = max_count_index + 1

print "The most common word length is " + max_word_length
Enter fullscreen mode Exit fullscreen mode

Indeed, most of the code can remain the same. But some key changes need to be made.

First changes

First, instead of storing the counters in an array we will use a map. It will be a generic empty map, like this:

word_length_counters = {}
Enter fullscreen mode Exit fullscreen mode

One advantage of using maps is that we can use "human" numbers as keys. That is: we can use directly the word-lenghts (which go from 1 to 12) without having to subtract 1 as we did for array indexes.

In particular MiniScript allows you to use pretty much anything as map-indexes, which is very convenient. It even allows you to use numbers as keys - which we will be doing here.

The first part of the for-loop remains the same:

for word in words
    word_length = len(word)
    if word_length > 12 then continue
Enter fullscreen mode Exit fullscreen mode

But then it gets interesting. We don't need this:

counter_index = word_length - 1
Enter fullscreen mode Exit fullscreen mode

We can access the current_count by word_length as mentioned:

current_count = word_length_counters[word_length]
Enter fullscreen mode Exit fullscreen mode

And store it back, also using word_length as the key of the map:

word_length_counters[counter_index] = current_count + 1
Enter fullscreen mode Exit fullscreen mode

Problem found

Putting it all together, this is how the for-loop would look like:

for word in words
    word_length = len(word)
    if word_length > 12 then continue
    current_count = word_length_counters[word_length]
    word_length_counters[word_length] = current_count + 1
end for
Enter fullscreen mode Exit fullscreen mode

Looks ok, right? But what happens when you run it?

Runtime Error: Key Not Found: '1' not found in map [line 11]
Enter fullscreen mode Exit fullscreen mode

Why is that? Problem is, the map is initially empty! So, the first time the "current_count" wants to be retrieved, nothing is found. We need to handle the case where the map does not yet have a counter for a given word-length.

We need to ask: does this map have the key "word_length"? If not, add an entry for it with a 0. This will have the effect that retrieving the "current_count" in the next line will find something (a 0 value) to add 1 to.

First-time access

How do we ask if a map has a certain "key"? In MiniScript this is solved at the "collection" level. That is: there is a method that is called the same and acts similarly for strings, lists and maps. That method is hasIndex.

With that in mind, this is the line we need:

if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
Enter fullscreen mode Exit fullscreen mode

Putting that before the first access gives us then this (fixed) for-loop:

for word in words
    word_length = len(word)
    if word_length > 12 then continue
    if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
    current_count = word_length_counters[word_length]
    word_length_counters[word_length] = current_count + 1
end for
Enter fullscreen mode Exit fullscreen mode

The result-calculation

Now is the time to adapt the result-calculation.

In the array-based approach word_length_counters already contained the word-lengths. But because we are using a map (which maps from word-length to word-counts) we need to look at the values of our map.

The first line would thus change from:

max_count = word_length_counters.max
Enter fullscreen mode Exit fullscreen mode

to:

max_count = word_length_counters.values.max
Enter fullscreen mode Exit fullscreen mode

Because the keys of the map are already word-lengths, we don't need to convert from "index" to "word-length". Even if MiniScript is talking about "index", it really means "key" in the context of a map.

Therefor the next lines can be simplified from:

max_count_index = word_length_counters.indexOf(max_count)
max_word_length = max_count_index + 1
Enter fullscreen mode Exit fullscreen mode

to:

max_word_length = word_length_counters.indexOf(max_count)
Enter fullscreen mode Exit fullscreen mode

The final print remains the same.

Complete solution

Finally, the complete map-based solution looks like this:

import "listUtil"

words = file.readLines("/sys/data/englishWords.txt")

word_length_counters = {}

for word in words
    word_length = len(word)
    if word_length > 12 then continue
    if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
    current_count = word_length_counters[word_length]
    word_length_counters[word_length] = current_count + 1
end for

max_count = word_length_counters.values.max
max_word_length = word_length_counters.indexOf(max_count)

print "The most common word length is " + max_word_length
Enter fullscreen mode Exit fullscreen mode

Hopefully it was clear how to use maps in general, and in particular in this case (interestingly, using numbers as keys).

Latest comments (2)

Collapse
 
joestrout profile image
JoeStrout

Nice post! I just wanted to point out another approach to the initial-value problem. Actually, two other approaches. :) One is to initialize your map to zeros right away:

word_length_counters = {}
for i in range(1,12); word_length_counters[i] = 0; end for
Enter fullscreen mode Exit fullscreen mode

The other is to import "mapUtil" and then use the map.get method that module adds. get takes an index, and a default value to use if that index is not found. So then you could safely get the current_count as

current_count = word_length_counters.get(word_length, 0)
Enter fullscreen mode Exit fullscreen mode

without worrying about whether it had a previous value or not.

Collapse
 
sebnozzi profile image
Sebastian Nozzi

Thanks a lot for the pointers! I like both approaches - whatever it takes to make code simpler. I particularly like the get method from "mapUtil". Had I known it existed I would have used this in my solution. I use this on Python where applicable and I'm glad to see that it's included in Mini Micro.