DEV Community

Cover image for Writing a Memoization Function from Scratch
TK
TK

Posted on • Originally published at iamtk.co

Writing a Memoization Function from Scratch

This post was originally published at TK's website.


Today we'll learn how to write a memoization function and speed up the performance of our functions.

Let's start with how we would use it. The API of the memoization function should be like this:

const memoizedFn = memoize(fn);
Enter fullscreen mode Exit fullscreen mode

The function receives a function as an input and returns the memoized function.

In the first function call, it's still "cold", so every function call will be executed and then the output value will be cached for the subsequent calls.

In any subsequent function call, the values are cached, so rather than executing the function body, it'll just return the cached value.

Let's go to the implementation first and then do some performance comparison.

The first thing is that the memoize function should receive a function as an input and return the memoized function. Simple steps:

function memoize(fn) {
  return fn;
}
Enter fullscreen mode Exit fullscreen mode

So now we can just call the memoize function passing a new function and it will return the same input function.

const fn = (a, b) => a * b;
const memoizedFn = memoize(fn);

memoizedFn(1, 1); // 1 (first call: cold)
memoizedFn(1, 1); // 1 (not memoized: it'll execute the function again)
memoizedFn(1, 2); // 2 (first call: cold)
Enter fullscreen mode Exit fullscreen mode

You see here that when we call the memoizedFn function for the second time, we expect that it just returns the cached value but as our implementation is not finished, it calls the fn function again for the same input values.

To memoize the function, I will use the Map as a cache object. So every time the function is called, we can access the cache object using the arguments as a key and get the output.

The cache object will look like this:

const cache = new Map();

cache.set('1', 1);
cache.get('1'); // 1

cache.set('2', 2);
cache.get('2', 2);
Enter fullscreen mode Exit fullscreen mode

The key of the cache object will be the arguments of the function and the value will be the output of the function call. So every time we call the memoized function again, we can access the cache object before calling the function.

The algorithm is pretty simple:

  • get all the arguments and "stringify" them to build the key of the cache object
  • verify if the key exists in the cache.
    • If it does
    • return the output value
    • if it doesn't
    • call the function
    • store the output value in the cache
    • and return the output value

Now implementing each part:

  • get the arguments: a simple function we should return, and get the arguments using the spread operator
(...args) => {};
Enter fullscreen mode Exit fullscreen mode
  • stringify the arguments to build the key of the cache object by using the JSON.stringify API
const key = JSON.stringify(args);
Enter fullscreen mode Exit fullscreen mode
  • verify if the key exists in the cache. If it does, return the output value
if (cache.has(key)) {
  return cache.get(key);
}
Enter fullscreen mode Exit fullscreen mode
  • if it doesn't: call the function, store the output value in the cache
const result = fn(...args);
cache.set(key, result);
Enter fullscreen mode Exit fullscreen mode

And this is the final version:

function memoize(fn) {
  const cache = new Map();

  return (...args) => {
    const key = JSON.stringify(args);

    if (cache.has(key)) {
      return cache.get(key);
    }

    const result = fn(...args);
    cache.set(key, result);

    return result;
  };
}
Enter fullscreen mode Exit fullscreen mode

To do the performance comparison and make sure our memoization function works, let's see the example of sum and factorial functions.

The sum is a pretty simple function and doesn't cost much, so we wouldn't see any significant improvements after caching the function calls.

function sum(a, b) {
  return a + b;
}
Enter fullscreen mode Exit fullscreen mode

And now calling it:

const memoizedSum = memoize(sum);

memoizedSum(1, 1); // 2 (not cached)
memoizedSum(1, 1); // 2 (cached)
Enter fullscreen mode Exit fullscreen mode

To do a better comparison and make sure the memoization speed up the function calls, I created two simple helper functions to build the testing case.

  • getNumbers is a generator of all the numbers we want to test as inputs for the memoization function.
function getNumbers(limit = 1000000) {
  let numbers = [];

  for (let i = 0; i < limit; i++) {
    numbers.push(i);
  }

  return numbers;
}
Enter fullscreen mode Exit fullscreen mode
  • testSum is a function to test the execution time of a given callback function.
function testSum(label, numbers, callback) {
  console.time(label);
  for (let number of numbers) {
    callback(number, number);
  }
  console.timeEnd(label);
}
Enter fullscreen mode Exit fullscreen mode

So let's test it: Calling the getNumbers, we get an array of 1.000.000 numbers.

const numbers = getNumbers();
Enter fullscreen mode Exit fullscreen mode

Calling the testSum passing the memoized function:

  • cold call
  • cached call
testSum('cold', numbers, memoizedSum);
testSum('cached', numbers, memoizedSum);
Enter fullscreen mode Exit fullscreen mode

Running in my machine (MacBook Pro (13-inch, M1, 2020), Chip Apple M1, Memory 16GB), I got this comparison.

------- sum --------
cold: 495.026ms
cached: 371.011ms
------- // --------
Enter fullscreen mode Exit fullscreen mode

The memoized version is 1.33% faster than the pure version.

As I mentioned earlier, the sum function is a simple function therefore its execution doesn't cost that much, so we won't see a lot of performance improvements in the cached version.

But now, let's see the factorial function being compared to its memoized version. As it's a somewhat more complex function, we'll probably see the memoized version sped up.

function factorial(number) {
  if (number < 0) return -1;
  if (number === 0) return 1;
  return number * factorial(number - 1);
}
Enter fullscreen mode Exit fullscreen mode

Without a caching mechanism, we can implement the factorial function using the recursion technique.

  • if the number is smaller than zero: returns -1
  • if the number is zero: returns 1
  • otherwise, return the number times the factorial of the number - 1

For the test, this is our memoized version of the factorial:

const memoizedFactorial = memoize(factorial);
Enter fullscreen mode Exit fullscreen mode

Let's call it and compare the cold and the cached versions.

Similar to the testSum function, I created a testFactorial function to handle the testing.

function testFactorial(label, numbers, callback) {
  console.time(label);
  for (let number of numbers) {
    callback(number);
  }
  console.timeEnd(label);
}
Enter fullscreen mode Exit fullscreen mode

It's very similar to the testSum function, the only difference is that the callback only receives one parameter.

Running these two times:

testFactorial('cold', numbers, memoizedFactorial);
testFactorial('cached', numbers, memoizedFactorial);
Enter fullscreen mode Exit fullscreen mode

We get this performance comparison (running on a MacBook Pro (13-inch, M1, 2020), Chip Apple M1, Memory 16GB machine):

------ factorial ------
cold: 572.349ms
cached: 1.919ms
--------- // ---------
Enter fullscreen mode Exit fullscreen mode

The memoized version is 298% faster than the pure version.

Closing thoughts

Memoization and other performance optimization approaches are always interesting. And one of the best ways to deeply understand anything is by building from scratch.

We have implemented the memoization algorithm step by step, we tested the function with the sum function, and saw some performance improvements but not that much, as the sum function is pretty simple.

We also tested with a more complex function, the factorial function. In this performance test, we could see a lot of improvements and how this concept is powerful.

Top comments (3)

Collapse
 
jonrandy profile image
Jon Randy πŸŽ–οΈ

Isn't your memoizedFactorial function calling the original factorial function?

Shouldn't you be doing:

const factorial = memoize(factorial)
Enter fullscreen mode Exit fullscreen mode

instead of:

const memoizedFactorial = memoize(factorial)
Enter fullscreen mode Exit fullscreen mode

so that the memoized version will call itself and not the original?

Collapse
 
Sloan, the sloth mascot
Comment deleted
Collapse
 
jonrandy profile image
Jon Randy πŸŽ–οΈ

It is what I'm talking about, and my code sample is a simple way to do it.