Richard Feldman

Posted on Feb 10, 2020 • Originally published at blog.noredink.com

What would you pay for type checking?


Here’s a statement that shouldn’t be controversial, but is anyway: JavaScript is a type-checked language.

I’ve heard people refer to JavaScript as “untyped” (implying that it has no concept of types), which is odd considering JS's most infamous error—“undefined is not a function”—is literally an example of the language reporting a type mismatch. How could a supposedly “untyped” language throw a TypeError? Is JS aware of types or isn't it?

Of course, the answer is that JavaScript is a type-checked language: its types are checked at runtime. The fact that the phrase “JavaScript is a type-checked language” can be considered controversial is evidence of the bizarre tribalism we’ve developed around when types get checked. I mean, is it not accurate to say that JavaScript checks types at runtime? Of course it's accurate! Undefined is not a function!
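JavaScript is not alone here; other runtime-checked languages report the same kind of mismatch. As a rough analogue, here is a minimal sketch in Python rather than JavaScript (the function name call_it is made up for illustration). Calling a value that is not a function raises a TypeError while the program runs, much like “undefined is not a function.”

```python
def call_it(callback):
    """Call whatever we were handed, trusting that it is a function."""
    return callback()


try:
    # None plays the role of JavaScript's undefined in this sketch.
    call_it(None)
except TypeError as error:
    # The runtime checked the value's type before calling it and refused.
    message = str(error)
    print(message)
```

The error is only discovered when the faulty call is actually executed, which is exactly what checking types at runtime means.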

Truly Untyped Languages

Assembly language does not have “undefined is not a function.”

This is because it has neither build-time nor runtime type checking. It’s essentially a human-readable translation of machine code, allowing you to write add instead of writing out by hand the number corresponding to an addition machine instruction.

So what happens if you get a runtime type mismatch in Assembly? If it doesn’t check the types and report mismatches, like JavaScript does, what does it do?

Let’s suppose I’ve written a function that capitalizes the first letter in a lowercase string. I then accidentally call this code on a number instead of a string. Whoops! Let’s compare what would happen in JavaScript and in Assembly.

Since Assembly doesn't have low-level primitives called “number” or “string,” let me be a bit more specific. For “number” I’ll use a 64-bit integer. For “string” I’ll use the definition C would use on a 64-bit system, namely “a 64-bit memory address pointing to a sequence of bytes ending in 0.” To keep the example brief, the function will assume the string is ASCII encoded and already begins with a lowercase character.

The assembly code for my “capitalize the first letter in the string” function would perform roughly the following steps.

1. Treat my one 64-bit argument as a memory address, and load the first byte from memory at that address.
2. “Capitalize” that byte by subtracting 32 from it. (In ASCII, subtracting 32 from a lowercase letter’s character code makes it uppercase.)
3. Write the resulting byte back to the original memory address.

If I call this function passing a “string” (that is, a memory address to the beginning of my bytes), these steps will work as intended. The function will capitalize the first letter of the string. Yay!

If I call this function passing a normal integer…yikes. Here are the steps my Assembly code will once again faithfully perform:

1. Treat my one 64-bit argument as a memory address, even though it’s actually supposed to be an integer. Load the first byte from whatever memory happens to be at that address. This may cause a segmentation fault (crashing the program immediately with the only error information being “Segmentation fault”) due to trying to read memory the operating system would not allow this process to read. Let’s proceed assuming the memory access happened to be allowed, and the program didn’t immediately crash.
2. “Capitalize” whatever random byte of data we have now loaded by subtracting 32 from it. Maybe this byte happened to refer to a student's test score, which we just reduced by 32 points. Or maybe we happened to load a character from the middle of a different string in the program, and now instead of saying “Welcome, Dave!” the screen says “Welcome, $ave!” Who knows? The data we happen to load here will vary each time we run the program.
3. Write the resulting byte back to the original memory address. Sorry, kid - your test score is just 32 points lower now.
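The steps above can be imitated with a toy model. This is only a sketch, not real Assembly: a Python bytearray stands in for memory, and every name and address in it (memory, store_c_string, capitalize_first, addresses 0, 16, and 40) is an assumption made for illustration. The function follows the three steps blindly, so it cannot tell a string's address from an ordinary integer.

```python
# A toy model of untyped memory: one flat bytearray, addressed by integer.
# All names and addresses here are made up for illustration.
memory = bytearray(64)


def store_c_string(address, text):
    """Write text as ASCII bytes followed by a terminating 0 byte."""
    data = text.encode("ascii") + b"\x00"
    memory[address:address + len(data)] = data


def capitalize_first(argument):
    """Follow the three steps blindly, with no idea what `argument` means."""
    byte = memory[argument]    # 1. treat the argument as an address, load a byte
    byte = (byte - 32) % 256   # 2. "capitalize" by subtracting 32, wrapping like a byte
    memory[argument] = byte    # 3. write the result back to the same address


store_c_string(0, "dave")             # a real string lives at address 0
store_c_string(16, "Welcome, Dave!")  # another string lives at address 16
memory[40] = 90                       # a student's test score lives at address 40

capitalize_first(0)   # intended use: the argument really is a string's address
capitalize_first(25)  # the "integer" 25 happens to point at the D in "Dave"
capitalize_first(40)  # the "integer" 40 happens to point at the test score

print(bytes(memory[0:4]))    # b'Dave'
print(bytes(memory[16:30]))  # b'Welcome, $ave!'
print(memory[40])            # 58
```

One caveat about the model: Python still checks bounds on the bytearray, so an address past the end raises an IndexError. That is a far friendlier failure than the segmentation fault the real thing might produce.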

Hopefully we can all agree that “undefined is not a function” is a significant improvement over segmentation faults and corrupting random parts of memory. Runtime type checking can prevent memory safety problems like this, and much more.

Bytes are bytes, and many machine instructions don’t distinguish between bytes of one type or another. Whether done at build time or at runtime, having some sort of type checking is the only way to prevent disaster when we’d otherwise instruct the machine to interpret the bytes the wrong way. “Types for bytes” was the original motivation for introducing type checking to programming, although it has long since grown beyond that.

Objective Costs of Checking Types

It’s rare to find discussions of objective tradeoffs in the sea of “static versus dynamic” food fights, but this example actually illustrates one.

As the name suggests, runtime type checking involves doing type checking…at runtime! The reason JavaScript wouldn’t cause a segmentation fault or corrupt data in this example, where Assembly would, is that JavaScript would generate more machine instructions than the Assembly version. Those instructions would record in memory the types of each value, and then before performing a certain operation, first read the type out of memory to decide whether to proceed with the operation or throw an error.
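To make that extra work concrete, here is a minimal sketch of the idea in Python. It is not how any real JavaScript engine is implemented; the Tagged class and checked_call function are hypothetical names invented for this illustration. Each value is stored next to a tag recording its type, and the tag is read before the operation is allowed to proceed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Tagged:
    """A value stored together with a tag recording its type."""
    tag: str      # e.g. "function", "number", "undefined"
    payload: Any  # the underlying value


def checked_call(value, *args):
    """Read the tag first; only perform the call if the tag allows it."""
    if value.tag != "function":
        raise TypeError(f"{value.tag} is not a function")
    return value.payload(*args)


increment = Tagged("function", lambda n: n + 1)
undefined = Tagged("undefined", None)

print(checked_call(increment, 41))  # 42

try:
    checked_call(undefined)
except TypeError as error:
    message = str(error)
    print(message)  # undefined is not a function
```

Both costs show up even in this small sketch: the tag takes up memory alongside every value, and the comparison runs before every call.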

This means that in JavaScript, a 64-bit number often takes up more than 64 bits of memory. There’s the memory needed to store the number itself, and then the extra memory needed to store its type. There’s also more work for the CPU to do: it has to read that extra memory and check the type before performing a given operation. In Python, for example, a 64-bit integer takes up 192 bits (24 bytes) in memory.
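The exact figure depends on the Python implementation, its version, and the size of the number, so it is easier to measure than to assume. A small sketch, assuming CPython (where sys.getsizeof reports the size of the whole integer object):

```python
import sys

value = 2**62  # fits comfortably in a signed 64-bit integer

raw_bytes = 8                        # the 64 bits of the number itself
object_bytes = sys.getsizeof(value)  # what the runtime reports for the whole object

print(f"raw: {raw_bytes} bytes, as a Python object: {object_bytes} bytes")
```

Whatever number your interpreter prints, the difference between the two is bookkeeping the runtime keeps alongside the value, including the information it needs to check types.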

In contrast, build-time type checking involves doing type checking…at build time! This does not have a runtime cost, but it does have a build-time cost; an objective downside to build-time type checking is that you have to wait for it.
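As a rough illustration in Python, type annotations can be checked ahead of time by a separate tool. This sketch assumes a static checker such as mypy is run as its own step before the program; Python itself does not enforce the annotations when the code runs. The function capitalize_first is a hypothetical example.

```python
def capitalize_first(text: str) -> str:
    """Return text with its first character upper-cased."""
    return text[:1].upper() + text[1:]


greeting = capitalize_first("dave")
print(greeting)  # Dave

# A static checker such as mypy, run before the program, would flag the call
# below as passing an int where a str is expected. It is left commented out
# so that this file still runs.
# capitalize_first(42)
```

The mistake is reported without executing anything, and the price is the time spent waiting for the checker to finish.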

Programmer time is expensive, which implies that programmers being blocked waiting for builds is expensive. Elm’s compiler builds so fast that at NoRedInk we’d have paid a serious “code’s compiling” productivity tax if we had chosen TypeScript instead—to say nothing of what we’d have missed in terms of programmer happiness, runtime performance, or the reliability of our product.

That said, using a language without build-time checking will not necessarily cause you to spend less time waiting. Stripe’s programmers would commonly wait 10-20 seconds for one of their Ruby tests to execute, but the Ruby type checker they created was able to give actionable feedback on their entire code base in that time. In practice, introducing build-time type checking apparently led them to spend less time overall on waiting.

Performance Optimizations for Type Checkers

Both build-time and runtime type checkers are programs, which means their performance can be optimized.

For example, JIT compilers can reduce the cost of runtime type checking. JavaScript in 2020 runs multiple orders of magnitude faster than JavaScript in 2000 did, because a massive effort has gone into optimizing its runtime. Most of the gains have been outside the type checker, but JavaScript’s runtime type checking cost has gone down as well.

Conversely, between 2000 and 2020 JavaScript’s build times have exploded—also primarily outside type checking. When I first learned JavaScript (almost 20 years ago now, yikes!) it had no build step. The first time I used JS professionally, the entire project had one dependency.

Today, just installing the dependencies for a fresh React project takes me over a minute—and that’s before even beginning to build the project itself, let alone type check it! By contrast, I can build a freshly git-cloned 4,000-line Elm SPA in under 1 second total, including installing dependencies and full type checking.

While they may improve performance overall, JIT compilers introduce their own runtime costs, and cannot make runtime type checking free. Arguably Rust’s main reason for existence is to offer a reliable and ergonomic programming language which does not introduce the sort of runtime overhead that comes with JIT compilers and garbage collectors.

Build-time type checkers are also programs, and their performance can also be optimized.

We often lump build-time type checking performance into the bucket of “compilation time,” but type checking isn’t necessarily the biggest contributor to slow builds. For example, in the case of Rust, code generation is apparently a much bigger contributor to compile times than type checking—and code generation only begins after type checking has fully completed.

Some type checkers with essentially equivalent type systems build faster than others, because of performance optimization. For example, the 0.19.0 release of Elm did not change the type system at all, but massively improved build times by implementing certain performance optimizations which (among other things) made part of type inference take O(1) time instead of O(log(n)) time.

Type Systems Influence Performance

Type system design decisions aren’t free! At both build time and runtime, type checking performance is limited by the features of the type system itself.

For example, researchers have developed type inference strategies that run very fast, but these strategies rely on some assumptions being true about the design of the type system. Introducing certain subtyping features can invalidate these strategies, so offering such features lowers the ceiling on how fast the compiler can be—and for that matter, whether it can offer type inference.

It’s easy to quip “you could guarantee that at build time using ________ types” (fill in the blank with something like linear types, refinement types, dependent types, etc.) but the impact this would have on compilation times is less often discussed.

If your language introduced a given type system feature tomorrow, what would the impact be on compile times? Has anyone developed a way to check those types quickly? How much value does a given feature need to add to compensate for the swordfighting downtime it brings along with it?

Runtime type checkers are subject to these tradeoffs as well. Python and Clojure have different type systems, for example. So do Ruby and Elixir, and JavaScript and Lua. The degree to which their performance can be optimized (by JIT compilers, for example) depends in part on the design of these type systems.

Because it’s faster to check some type system features at runtime than at build time, these forces combine to put performance caps on languages which add build-time checking to type systems which were designed only with runtime type checking in mind. For example, TypeScript’s compiler could run faster if it did not need to accommodate JavaScript’s existing type system.

What Would You Pay?

Except when writing in a truly untyped language like Assembly, we’re all paying for type checking somewhere—whether at build time, at runtime, or both. That cost varies based on what performance optimizations have been done (such as build-time algorithmic improvements and runtime JITs), and while type system design choices can restrict which optimizations are available, they don’t directly cause performance to be fast or slow.

Programming involves weighing lots of tradeoffs, and it’s often challenging to anticipate at the beginning of a project what will cause problems later. “The build runs too slowly” and “the application runs too slowly” are both serious problems to have, and which programming language you choose puts a cap on how much you can improve either.

We all have different tolerances for how much we’re willing to pay for this checking, and what we expect to get out of it. It’s worth thinking critically about these tradeoffs, to make conscious decisions rather than choosing the same technology we chose last time because it’s familiar.

So the next time you’re starting a project, think about these costs and benefits up front. What would you pay for type checking?

Thanks to Brian Hicks, Christoph Hermann, Charlie Koster, Alexis King, and Hillel Wayne for reading drafts of this.

Top comments (10)

Kris Nuttycombe • Feb 10 '20

The cost in build time has to be weighed against the costs of debugging an error at later stages. How much does it cost to find logs, check out some code you haven’t read in a while, write a test to reproduce the error that you couldn’t make unrepresentable under a typing discipline, fix the error, get the fix reviewed and finally release it to production?

Of course, to do this calculation you need to know the probability of error, which cannot be estimated, but must be determined empirically. And you have to factor in programmer time to learn how to use your type system to make such errors unrepresentable, and mistakes in attempts to do so, which can impose a large time cost.

I wish we had better data to inform these decisions. How many errors could have been statically made impossible? How much time has been spent on fancily-typed wild goose chases? Perhaps we need a shared way of collecting this data other than just our accumulated individual anecdotes.

Yawar Amin • Feb 11 '20

Studies have been done; it's been known for a long time that errors caught later in the development lifecycle are exponentially more expensive to fix. See embedded.typepad.com/bughunter/err...

daanchuk • Feb 10 '20

Any ideas how we could collect such data?

Adam Crockett 🌀 • Feb 10 '20 • Edited

You would need to start two identical projects, counting the time to debug and the runtime programmer mistakes found in plain JS, versus the total time to build and debug and the programmer mistakes in something like TypeScript in the strictest style. The projects would need to be simple, so you could multiply the result by lines of code to get an average payoff for a forecasted project size X. Take it further by calculating employee wage over time. Unfortunately this experiment doesn't account for programmer skill and diligence.

Kris Nuttycombe • Feb 10 '20

Ideally, since it’s of most interest to project leads and managers, they’d be the ones to collect it. I feel like the opportunity to track and share data like this ought to be an integrated feature of project management tools.

Florian Rappl • Feb 11 '20

Who said JS is untyped?! The definition is that JS is dynamically typed, i.e., types can in general only be seen at runtime. This is in contrast to static typing where the compiler knows all the types in advance.

While the performance argument is legit I think the real question is: "how many bugs need to be caught upfront to justify additional tooling?". I think a single one already justifies slower builds.

Karol Marcjan • Feb 14 '20

AFAIK people who say that are pedantically using a specific, technical definition of what a type is that was never intended to be used in all contexts.

Patryk • Feb 11 '20

Sorry, kid - your test score is just 32 points lower now.

It levels the playing field. If you scored less than 32 points, you've now got something close to 2 ** 63 on your test. 😎

Eljay-Adobe • Feb 10 '20

I've built a very large application in TypeScript. The build time was a non-issue.

That being said, I'm biased in favor of Elm.

Patrick Charles-Lundaahl • Feb 10 '20

Thanks for sharing, Richard! There's some fantastic food for thought here, even if a lot of the time we don't get a say in the language our employer uses.