loading...
Cover image for Understand Memoization in 5 Minutes

Memoization Understand Memoization in 5 Minutes

nas5w profile image Nick Scialli (he/him) Originally published at typeofnan.dev ・3 min read

Please give this post a πŸ’“, πŸ¦„, or πŸ”– if it you enjoyed it!

Memoization is another intimidating term that becomes quite intuitive when you understand it. Today, let's learn what memoization is!


A Couple Notes

  • I made a video version of this tutorial! Check it out here.
  • If you like this post, please consider subscribing to my free weekly webdev newsletter!

Introduction

Memoization is an optimization technique used in many programming languages to reduce the number of redundant, expensive function calls. This is done by caching the return value of a function based on its inputs. In this post, we'll create a suboptimal, but hopefully educationally-informative, JavaScript function memoizer!

First, an Expensive Function to Memoize

Here's a function for us to memoize. It finds the square of a number in a very inefficient way.

const inefficientSquare = num => {
  let total = 0;
  for (let i = 0; i < num; i++) {
    for (let j = 0; j < num; j++) {
      total++;
    }
  }
  return total;
};

We can run this function with the same value and, each time, it'll take a while to execute.

const start = new Date();
inefficientSquare(40000);
console.log(new Date() - start);
// 1278

const start2 = new Date();
inefficientSquare(40000);
console.log(new Date() - start2);
// 1245

Over one second each time, yikes!

Writing Pseudocode for our Memoizer

Let's reason through our memoizer before we write any code.

  • Takes a reference to a function as an input
  • Returns a function (so it can be used as it normally would be)
  • Creates a cache of some sort to hold the results of previous function calls
  • Any future time calling the function, returns a cached result if it exists
  • If the cached value doesn't exist, calls the function and store that result in the cache

Real Code Time

Here's an implementation of the above pseudocode outline. As mentioned in the introduction, this is suboptimal and you should not use this in production. I'll explain why after!

// Takes a reference to a function
const memoize = func => {
  // Creates a cache of results
  const results = {};
  // Returns a function
  return (...args) => {
    // Create a key for results cache
    const argsKey = JSON.stringify(args);
    // Only execute func if no cached value
    if (!results[argsKey]) {
      // Store function call result in cache
      results[argsKey] = func(...args);
    }
    // Return cached value
    return results[argsKey];
  };
};

The most suboptimal part of this implementation, and why I wouldn't recommend it be used in production code, is using JSON.stringify to create keys in our results cache. The biggest problem with JSON.stringify is that it doesn't serialize certain inputs, like functions and Symbols (and anything you wouldn't find in JSON).

Testing Our Memoizer on an Expensive Function

Let's replicate our inefficientSquare example, but this time we'll use our memoizer to cache results.

const memoize = func => {
  const results = {};
  return (...args) => {
    const argsKey = JSON.stringify(args);
    if (!results[argsKey]) {
      results[argsKey] = func(...args);
    }
    return results[argsKey];
  };
};

const inefficientSquare = memoize(num => {
  let total = 0;
  for (let i = 0; i < num; i++) {
    for (let j = 0; j < num; j++) {
      total++;
    }
  }
  return total;
});

const start = new Date();
inefficientSquare(40000);
console.log(new Date() - start);
// 1251

const start2 = new Date();
inefficientSquare(40000);
console.log(new Date() - start2);
// 0

Success! The second time we call inefficientSquare with the same input it takes no time to recompute; we're simply pulling the cached value from an object.

Only Memoize Pure Functions!

Memoization is great, but it only works if your function is pure. In other words, if your function's returned value is dependent on more than its inputs, then your cached value for those inputs won't always be correct. Also, if your function has side-effects, the memoizer doesn't replicate those, it simply returns the ultimately-returned function value.

Conclusions

You should now have a good idea of how and why we use memoization! While our memoization function was sub-optimal, there are plenty of third party libraries out there you can use that will do a much better. Just make sure the functions you're memoizing are pure!

Posted on Jun 25 by:

nas5w profile

Nick Scialli (he/him)

@nas5w

Husband, dog dad, software engineer, coffee monster. Working in civic tech!

Discussion

markdown guide
 

Good article and video!
Just wanted to note to consider using more precise way when measuring NodeJS performance with process.hrtime that returns tuple Array [seconds, nanoseconds].
When using Date and console.log in between measurements, there are additional tasks that impacts results with no need. For more complicated examples it could involve background optimization and inconsistent results.

Here is how you can use process.hrtime for your example

var st1 = process.hrtime();
inefficientSquare(40000);
var e1 = process.hrtime();
var r10 = e1[0] - st1[0];
var r11;
if (r10 > 0) { // inverting sign if execution is measured between 2 different seconds
  r11 = - e1[1] + st1[1];
} else {
  r11 = e1[1] - st1[1];
}

var st2 = process.hrtime();
inefficientSquare(40000);
var e2 = process.hrtime();
var r20 = e2[0] - st2[0];
var r21;
if (r20 > 0) { // inverting sign if execution is measured between 2 different seconds
  r21 = - e2[1] + st2[1];
} else {
  r21 = e2[1] - st2[1];
}

// this will log diff between 2 points in time in nanoseconds and will be precise if execution time is less then 1 second, for larger executions you should include r10 and r20;
console.log(r11); // o: 197783200 nanoseconds ~= 198 miliseconds
console.log(r21); // o:  200077600 ~= 200 miliseconds
 

Very nice writing. One thing to add, our application might use too much memory in case we cache everything(ex: a pod of a deployment on our k8s cluster that runs for a long time). Maybe cleaning up the cache from the not frequently used ones might work in that case. Also that would also require keeping another variable in for cached records sth like number times function called with these parameters.

 

Plz avoid using JSON.stringify for memoization keys.

  • It doesn't support anything except json-likes things
  • it's extremely slow

But you can use:

  • Maps for objects keys
  • WeakMaps for objects avoiding GC
  • Custom non-object keys
 

I appreciate the note.

I did already call out that JSON.stringify is suboptimal in the article.

 

Yet after you did that, you never gave an optimal alternative. You said it was "suboptimal" yet repeated it in your "optimal" solution.

 

Hi Nick!
Thanks for the post!
Could someone clarify me please, why "const results = {}" is stays the same on each call of method "memoize()"? I thought it will be created each time on call "memoize()" and this approach wouldn't work, but seems like it works! :)

Thanks.

 
 

Yeah... i didn't notice that "memoize()" called only once.