Sebastian Nozzi

Posted on

Counting word lengths (map approach)

In a previous post we saw already how to solve the problem by using an array.

Specifically we had a 12-length array, and each position was used to increase a counter according to the current word's length. We had to convert from word-length to array-index by subtracting 1 from the first number.

This solution worked perfectly fine and given the constraint of only considering word-lengths up to 12, I would go this route.

But it is worth looking at a map-based approach (in other languages "map" is known as "dictionary"). Some other problems are better solved with this approach.

Recap

The complete array-based solution looked like this:

``````import "listUtil"

word_length_counters = [0] * 12

for word in words
word_length = len(word)
if word_length > 12 then continue
counter_index = word_length - 1
current_count = word_length_counters[counter_index]
word_length_counters[counter_index] = current_count + 1
end for

max_count = word_length_counters.max
max_count_index = word_length_counters.indexOf(max_count)
max_word_length = max_count_index + 1

print "The most common word length is " + max_word_length
``````

Indeed, most of the code can remain the same. But some key changes need to be made.

First changes

First, instead of storing the counters in an array we will use a map. It will be a generic empty map, like this:

``````word_length_counters = {}
``````

One advantage of using maps is that we can use "human" numbers as keys. That is: we can use directly the word-lenghts (which go from 1 to 12) without having to subtract 1 as we did for array indexes.

In particular MiniScript allows you to use pretty much anything as map-indexes, which is very convenient. It even allows you to use numbers as keys - which we will be doing here.

The first part of the for-loop remains the same:

``````for word in words
word_length = len(word)
if word_length > 12 then continue
``````

But then it gets interesting. We don't need this:

``````counter_index = word_length - 1
``````

We can access the `current_count` by `word_length` as mentioned:

``````current_count = word_length_counters[word_length]
``````

And store it back, also using `word_length` as the key of the map:

``````word_length_counters[counter_index] = current_count + 1
``````

Problem found

Putting it all together, this is how the for-loop would look like:

``````for word in words
word_length = len(word)
if word_length > 12 then continue
current_count = word_length_counters[word_length]
word_length_counters[word_length] = current_count + 1
end for
``````

Looks ok, right? But what happens when you run it?

``````Runtime Error: Key Not Found: '1' not found in map [line 11]
``````

Why is that? Problem is, the map is initially empty! So, the first time the "current_count" wants to be retrieved, nothing is found. We need to handle the case where the map does not yet have a counter for a given word-length.

We need to ask: does this map have the key "word_length"? If not, add an entry for it with a 0. This will have the effect that retrieving the "current_count" in the next line will find something (a 0 value) to add 1 to.

First-time access

How do we ask if a map has a certain "key"? In MiniScript this is solved at the "collection" level. That is: there is a method that is called the same and acts similarly for strings, lists and maps. That method is `hasIndex`.

With that in mind, this is the line we need:

``````if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
``````

Putting that before the first access gives us then this (fixed) for-loop:

``````for word in words
word_length = len(word)
if word_length > 12 then continue
if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
current_count = word_length_counters[word_length]
word_length_counters[word_length] = current_count + 1
end for
``````

The result-calculation

Now is the time to adapt the result-calculation.

In the array-based approach `word_length_counters` already contained the word-lengths. But because we are using a map (which maps from word-length to word-counts) we need to look at the values of our map.

The first line would thus change from:

``````max_count = word_length_counters.max
``````

to:

``````max_count = word_length_counters.values.max
``````

Because the keys of the map are already word-lengths, we don't need to convert from "index" to "word-length". Even if MiniScript is talking about "index", it really means "key" in the context of a map.

Therefor the next lines can be simplified from:

``````max_count_index = word_length_counters.indexOf(max_count)
max_word_length = max_count_index + 1
``````

to:

``````max_word_length = word_length_counters.indexOf(max_count)
``````

The final `print` remains the same.

Complete solution

Finally, the complete map-based solution looks like this:

``````import "listUtil"

word_length_counters = {}

for word in words
word_length = len(word)
if word_length > 12 then continue
if not word_length_counters.hasIndex(word_length) then word_length_counters[word_length] = 0
current_count = word_length_counters[word_length]
word_length_counters[word_length] = current_count + 1
end for

max_count = word_length_counters.values.max
max_word_length = word_length_counters.indexOf(max_count)

print "The most common word length is " + max_word_length
``````

Hopefully it was clear how to use maps in general, and in particular in this case (interestingly, using numbers as keys).

JoeStrout

Nice post! I just wanted to point out another approach to the initial-value problem. Actually, two other approaches. :) One is to initialize your map to zeros right away:

``````word_length_counters = {}
for i in range(1,12); word_length_counters[i] = 0; end for
``````

The other is to `import "mapUtil"` and then use the `map.get` method that module adds. `get` takes an index, and a default value to use if that index is not found. So then you could safely get the current_count as

``````current_count = word_length_counters.get(word_length, 0)
``````

Sebastian Nozzi

Thanks a lot for the pointers! I like both approaches - whatever it takes to make code simpler. I particularly like the `get` method from "mapUtil". Had I known it existed I would have used this in my solution. I use this on Python where applicable and I'm glad to see that it's included in Mini Micro.

DEV Community

50 CLI Tools You Can't Live Without

>> Check out this classic DEV post <<