Every year, I like to teach myself a new language by working through the Advent of Code problems. This year, I wanted to dive into typescript. I've written a fair amount of JS before, and I like having explicit typechecking in my code, so TS felt like a good fit.

Everything was going great until I hit Day 15, and my solution worked well for the small case but was painfully slow in the large case.

You should definitely click through and read the problem statement, but to summarize:

You start off with a comma-separated list of numbers. This is your problem input. The challenge is to continue the sequence of numbers. Each number in the sequence (after the starting set) depends on the number immediately previous to it. If that previous number has never appeared in the sequence before, our new number is `0`

. If it *has* appeared in the sequence, our new number is the number of steps between when it last appeared in the sequence and now.

For example, let's say we start with `1,2,3`

as our input. The next numbers would be:

- To determine the fourth number, we'd look at the third number (
`3`

).`3`

has never appeared in the sequence before, so the fourth number is`0`

. - To determine the fifth number, we'd look at the fourth number (
`0`

).`0`

has never appeared in the sequence before, so the fifth number is`0`

. - To determine the sixth number, we look at the fifth number (
`0`

).`0`

has appeared in the sequence before, as the fourth number, so we take the difference (position 5 - position 4 = 1) and that difference is our sixth number:`1`

. - To determine the seventh number, we look at the sixth number (
`1`

).`1`

has appeared in the sequence before, so we subtract and get`5`

.

We can continue this pattern forever.

(Numberphile has a wonderful overview of this sequence that goes into a bit more detail)

Part 1 of the challenge is relatively simple: determine the 2020th number in the sequence. I decided to brute force it:

```
const solve = (input: string, end: number): number => {
const nums = input.split(',').map((l) => parseInt(l));
const spoken: Record<number, number> = {};
let i = 0;
let next = -1;
while (i < end - 1) {
const current = i < nums.length ? nums[i] : next;
if (spoken[current] === undefined) {
next = 0;
} else {
next = i - spoken[current];
}
spoken[current] = i;
i++;
}
return next;
};
```

The `spoken`

object is a `Record<number, number>`

where the keys are numbers and the values are the last index where that number appeared in the sequence. (Why "spoken"? See the Advent of Code description). On each iteration, we check if our *current* number has been spoken before, and set our `next`

number accordingly. On the next iteration of the loop, `current`

becomes `next`

, and so we repeat this process for as many iterations as we need.

So far, so good, but let's talk efficiency, because part 2 of the challenge is to calculate the 30 *millionth* number in the sequence. Performance-wise, I'd claim that this runs in O(n) time; the run time scales linearly with the number of values we want to compute. ie. Computing 2000 numbers takes twice as long as computing 1000 numbers. This is because we use that `spoken`

object to do fast lookups on the last index a number appeared in.

This is because we generally treat insertion and lookups in an object as "cheap" or constant time.

The drawback is that we're using tons of memory. We'd describe it as using O(n) memory; the amount of space we use grows linearly with the number of values we want to compute. (I have a proof for this but it's too large to fit in this margin...just kidding. As per the Numberphile video I referenced earlier, this sequences seems like it grows linearly, but no one has proven that yet.)

Since we're brute forcing the result, at the very least, we need to do one loop iteration per number we compute, so this O(n) runtime is the fastest we can get. If we had a magic math formula that could spit out the number at an arbitrary index, that would be much faster, but I don't have one (and as per that Numberphile video, neither do the mathematicians!).

So, it sounds like we're very close to our theoretical peak performance. Let's try and compute the 30 millionth number then.

And...when I do, it hangs. For upwards of 5 *minutes*.

Sounds like something is wrong. Very very wrong. I'm not sure where, though, so let's do some profiling.

```
const start_str = '1,2,3';
[2000, 20000, 200000, 2000000, 20000000].forEach((total) => {
const start = new Date().getTime();
solve(start_str, total);
console.log(`took ${new Date().getTime() - start} ms to compute ${total}`);
});
```

Basically, we're timing how long this method takes for successively larger and larger amounts of numbers to generate. If our solution was O(n), as I claimed above, we should see times that grow linearly; if 2000 takes 1ms, then 20,000 should take 10ms and 200,000 should take 100ms.

Instead, these are the results I measured:

```
took 0 ms to compute 2000
took 6 ms to compute 20000
took 90 ms to compute 200000
took 8552 ms to compute 2000000
took 351866 ms to compute 20000000
```

🤯🤯

The jump from 200,000 to 2,000,000 is particularly interesting: it took us 95 times longer to compute 10x the numbers! And going from 2,000,000 to 20,000,000 is also pretty bad: 10x the work took 41 times longer.

(Before I go any further, a quick note about performance metrics: all of these numbers were collected on my 2016 MacBook Pro running macOS 11.2, with a 3.1GHz i5 in it. I'm using node version `14.15.1`

and tsc `4.1.2`

. The absolute values of these numbers isn't important but what is important is the proportions.)

So, what's going on with these numbers? When I first wrote this code and saw these, I was extremely surprised. We already said it looks O(n), so why are we seeing huge non-linear spikes? Spoiler alert: my code wasn't O(n).

Yeah, that's right. It's not O(n). Why not, you ask?

Well, let's find out. Time to do some more profiling:

```
const solve = (input: string, end: number): number => {
const nums = input.split(',').map((l) => parseInt(l));
const spoken: Record<number, number> = {};
let i = 0;
let next = -1;
let read = 0;
let write = 0;
while (i < end - 1) {
const current = i < nums.length ? nums[i] : next;
let start = new Date().getTime();
const last_spoken = spoken[current];
read += new Date().getTime() - start;
if (last_spoken === undefined) {
next = 0;
} else {
next = i - last_spoken;
}
start = new Date().getTime();
spoken[current] = i;
write += new Date().getTime() - start;
i++;
}
console.log(`reading took ${read} ms`);
console.log(`writing took ${write} ms`);
return next;
};
```

We do two things with our `spoken`

lookup object: we read a value out and write a value in. Here I've added code to time how long each part takes, in total, across the entire operation. The results:

```
reading took 0 ms
writing took 1 ms
took 11 ms to compute 2000
reading took 5 ms
writing took 7 ms
took 31 ms to compute 20000
reading took 55 ms
writing took 130 ms
took 289 ms to compute 200000
reading took 547 ms
writing took 9463 ms
took 10935 ms to compute 2000000
reading took 4480 ms
writing took 346699 ms
took 365872 ms to compute 20000000
```

Logs can be hard to read, so let's dump this into a table:

2,000 | 20,000 | 200,000 | 2,000,000 | 20,000,000 | |
---|---|---|---|---|---|

read | 0 | 5 | 55 | 547 | 4480 |

write | 1 | 7 | 130 | 9463 | 346699 |

As you can see, read performance is fine, roughly linear. Write performance is drastically worse as our object gets bigger and bigger.

There isn't much more we can do here; we've gathered as much data as we can. We're observing something in the behavior of the language and runtime we're using, so now it's time to go see if anyone else has observed similar behavior, or if we've found a bug in node.

Armed with this information, off to the interwebs we go. Searching for "js object vs map performance" turns up a number of results of people *profiling* objects and maps, but I didn't find anything along the lines of what I was seeing. Then, I came across GitHub user `jngbng`

, who did similar experiments that suggest objects perform *faster* than Sets when there is a high number of common elements in the data, but much slower than Sets when most of the elements in the data are unique.

That led me to a more subtle insight here about our code. Object performance is demonstrably worse when there is a high variance in the keys...and as we know, the sequence we're computing grows continuously, which means we must be constantly seeing new numbers and adding them to our object. Thus, we must have a high variance in the keys we're adding into our object.

This insight led me to an article by Camillo Bruni, an engineer working on the V8 JavaScript engine at Google, on how V8 handles objects gaining new properties. He also linked to an article by Vyacheslav Egorov, a compiler engineer at Google, on how V8 handles property lookups on objects.

I will confess, a lot of the details in those two posts went over my head, but the summary is this: the V8 runtime is optimized for cases where you have many objects with the same sets of properties on them. Constantly adding new keys to our object breaks V8's property caches and then forces it to rebuild them each time we add a property to the object, which makes inserting new keys really slow. This is exactly what `jngbng`

found: objects with a small number of keys (or rarely-changing sets of keys) perform faster than Sets with the same keys.

In our scenario, we are adding new keys (the numbers we compute) to our object very frequently, meaning we very quickly get into the range where we frequently defeat the V8 Object property lookup caches!

We can actually confirm this with some more profiling:

```
const solve = (input: string, end: number): number => {
const nums = input.split(',').map((l) => parseInt(l));
const spoken: Record<number, number> = {};
let i = 0;
let next = -1;
let read = 0;
let insert = 0;
let num_inserts = 0;
let update = 0;
let num_updates = 0;
while (i < end - 1) {
const current = i < nums.length ? nums[i] : next;
let start = new Date().getTime();
const last_spoken = spoken[current];
read += new Date().getTime() - start;
if (last_spoken === undefined) {
next = 0;
start = new Date().getTime();
spoken[current] = i;
insert += new Date().getTime() - start;
num_inserts++;
} else {
next = i - last_spoken;
start = new Date().getTime();
spoken[current] = i;
update += new Date().getTime() - start;
num_updates++;
}
i++;
}
console.log(`reading took ${read} ms`);
console.log(`inserted ${num_inserts} times for a total of ${insert} ms and ${insert / num_inserts} ms on avg`);
console.log(`updated ${num_updates} times for a total of ${update} ms and ${update / num_updates} ms on avg`);
return next;
};
```

Now, instead of timing all of the write operations as one block, we measure the insert operations (ie. the cases where we write to new keys) separately from updates. We also need to track the *number* of inserts and updates we do, to make sure that if inserts are taking longer, it isn't because we just did more of them.

Sure enough, our numbers confirm our theory:

```
reading took 0 ms
inserted 380 times for a total of 0 ms and 0 ms on avg
updated 1619 times for a total of 1 ms and 0.0006176652254478073 ms on avg
took 34 ms to compute 2000
reading took 13 ms
inserted 3285 times for a total of 2 ms and 0.0006088280060882801 ms on avg
updated 16714 times for a total of 13 ms and 0.0007777910733516812 ms on avg
took 58 ms to compute 20000
reading took 41 ms
inserted 29247 times for a total of 54 ms and 0.001846343214688686 ms on avg
updated 170752 times for a total of 32 ms and 0.0001874062968515742 ms on avg
took 223 ms to compute 200000
reading took 389 ms
inserted 265514 times for a total of 7768 ms and 0.029256461052901164 ms on avg
updated 1734485 times for a total of 302 ms and 0.000174115083151483 ms on avg
took 9216 ms to compute 2000000
reading took 4429 ms
inserted 2441404 times for a total of 328123 ms and 0.1343993046623992 ms on avg
updated 17558595 times for a total of 4353 ms and 0.0002479127743421384 ms on avg
took 348712 ms to compute 20000000
```

And as a table:

2,000 | 20,000 | 200,000 | 2,000,000 | 20,000,000 | |
---|---|---|---|---|---|

average insert | 0 | 0.0006 | 0.0018 | 0.0293 | 0.1344 |

average update | 0.0006 | 0.0008 | 0.0002 | 0.0002 | 0.0002 |

(numbers have been rounded to 4 decimal places)

We can actually watch the average insertion time grow steadily as our object grows. Interestingly, although we seem to consistently update existing values 10x more than we insert new ones, by the time we generate 200,000 values, we're spending more time on inserts than on updates!

And notice how the average update time stays relatively constant? That's V8's property caching optimization in action. When the object doesn't change "shape" (ie. gain new keys), reads and writes are constant time.

So what's our fix? Simple: swap the `{}`

for a `Map`

:

```
const solve = (input: string, end: number): number => {
const nums = input.split(',').map((l) => parseInt(l));
const spoken: Map<number, number> = new Map();
let i = 0;
let next = -1;
while (i < end - 1) {
const current = i < nums.length ? nums[i] : next;
if (!spoken.has(current)) {
next = 0;
} else {
next = i - spoken.get(current);
}
spoken.set(current, i);
i++;
}
return next;
};
```

This implementation performed much much better:

```
took 0 ms to compute 2000
took 15 ms to compute 20000
took 38 ms to compute 200000
took 262 ms to compute 2000000
took 4170 ms to compute 20000000
```

Now the 10x growth from 200,000 to 2,000,000 only took 7x longer, and the 10x growth from 2,000,000 to 20,000,000 took 15x longer. Both much more reasonable multipliers than what we were seeing before.

If we make a similar change to time the individual sections of the code, we get logs like this:

```
reading took 2 ms
inserted 380 times for a total of 0 ms and 0 ms on avg
updated 1619 times for a total of 0 ms and 0 ms on avg
took 3 ms to compute 2000
reading took 6 ms
inserted 3285 times for a total of 1 ms and 0.00030441400304414006 ms on avg
updated 16714 times for a total of 4 ms and 0.00023932033026205577 ms on avg
took 28 ms to compute 20000
reading took 39 ms
inserted 29247 times for a total of 6 ms and 0.00020514924607652066 ms on avg
updated 170752 times for a total of 31 ms and 0.00018154985007496252 ms on avg
took 170 ms to compute 200000
reading took 523 ms
inserted 265514 times for a total of 65 ms and 0.0002448081833726282 ms on avg
updated 1734485 times for a total of 278 ms and 0.00016027812290103402 ms on avg
took 1383 ms to compute 2000000
reading took 6507 ms
inserted 2441404 times for a total of 775 ms and 0.0003174402925529736 ms on avg
updated 17558595 times for a total of 3271 ms and 0.0001862905317879933 ms on avg
took 16278 ms to compute 20000000
```

Add these new numbers to our table from before:

2,000 | 20,000 | 200,000 | 2,000,000 | 20,000,000 | ||
---|---|---|---|---|---|---|

object | average insert | 0 | 0.0006 | 0.0018 | 0.0293 | 0.1344 |

average update | 0.0006 | 0.0008 | 0.0002 | 0.0002 | 0.0002 | |

map | average insert | 0 | 0.0003 | 0.0002 | 0.0002 | 0.0003 |

average update | 0 | 0.0002 | 0.0002 | 0.0002 | 0.0002 |

(again, numbers have been rounded to 4 decimal places)

That performance on a `Map`

is much much better, and much more even. Notice how the difference in average insertion time and average update time barely differs on the `Map`

, no matter how large we get? That's the very definition of O(1) performance.

We can even go further with this-- we could swap in a pre-sized array instead of a `Map`

and get even faster performance. Leave a comment below if you know what this would look like :)

Anyways..what's our takeaway here? What did we learn?

Number 1 is the most obvious: for data sets with large numbers of keys, prefer `Map`

over `{}`

.

But it's not that simple: it's important to verify our assumptions. When I first wrote my function, performance analysis suggested that we had written Performant™️ code, but when we ran it, it choked. And despite the mental model that I had around objects, which I think is pretty common among JS programmers, the authors of the JS implementation you're using might make different tradeoffs that break those mental models.

In our case, the V8 runtime is optimized for the "many objects with the same keys" case instead of the "single object with many keys" case, and that tradeoff made our code much slower. So our second takeaway here is this: our mental models of the world aren't always accurate, and it's important to verify those models.

Third, everything is fast for small n. Our object-based algorithm ran just fine on the small case (computing ~2000 values) but fell apart for the larger case (~30 million values). And this was in a case where we actively were thinking about performance!

Have you run into cases like this before? Found any other accidentally n^2 algorithms? Let me know on twitter or in the comments below.

## Discussion (0)