Adam Nathaniel Davis

Posted on Mar 8, 2020 • Edited on Mar 22, 2021

JavaScript Type Checking... Without TypeScript

#javascript #typescript #tutorial #codequality

[NOTE: The concepts discussed in this article eventually evolved into a new approach with slightly different terminology. That approach now comprises a package that I call allow. You can find it here: https://www.npmjs.com/package/@toolz/allow]

There seem to be two crowds in the JavaScript community: those who use type-checking, and those who don't. If you read that last sentence as "...those who use TypeScript, and those who don't" you can be forgiven for reading a little more into the text than what was actually written. Because, far too often, projects that don't use TypeScript have an appalling lack of type-checking in place.

That's why I wrote this concise little utility that you can find here:

https://github.com/bytebodger/type-checking

Go ahead and pop on over there when you have a chance. It's only one file (is.js). It's all of 84 LoC. But I use this little utility on an incredibly frequent basis.

[Disclaimer: As you can imagine, with only 84 LoC, I'm not implying, in any way, that my silly little utility is any kind of replacement for TypeScript. If you want/need true type checking in your project, by all means, please reach for TypeScript (or Flow). This is just a helpful utility for those times when you're working inside a project that does not - or cannot - use TypeScript.]

The Problem

Nearly all of the programs that we write aren't actually singular, standalone programs. Instead, our programs consist of dozens/hundreds/thousands of miniature programs which, in aggregate, make up our application. You know what I'm talking about. These smaller component programs are known as functions.

Each function is a (hopefully) tiny program in its own right. It accepts an arbitrary list of zero-to-many inputs. It returns a single output - or it generates no output at all. Everything that happens inside that function operates as its own little program.

Now I'm a big believer that type mutability in dynamically-typed languages is a feature, not a "bug". If you want absolute certainty about all the types of all your variables at all times, then you shouldn't be programming in JavaScript in the first place. There are statically-typed languages that are there for the taking. And I can sometimes get kinda annoyed by the people who want to do everything they can to make JavaScript look/feel/act like C#.

But there's one area where I strongly believe that type certainty isn't a preference or a nice-to-have. This area is in the interface between functions. It's a must, if we're to write solid, robust, bug-free functions. In other words, it's nearly impossible to assure that our "mini-programs" (functions) will operate properly if we have no idea what type of arguments are being passed into them.

The Problem, Illustrated

const updateUser = (userId, name, age, currentEmployee, children) => {
   // the update logic...
   return updateResult;
};

Despite the simple nature of this function, there's really a lot that's potentially going on here. To update the user, we're accepting five separate arguments. Without taking the time to delve through any logic that might be inside the function, there are numerous questions that crop up:

Is userId supposed to be an integer? Or are we using some kind of alphanumeric (GUID) identifier, in which case this would be a string?
I assume that name should be a string, although it's not out-of-the-question to believe that the function expects name to be an object. Perhaps one that is formatted like so: {first:'Tom', middle:'Tim', last:'Tum'}. Or maybe an array, like: ['Tom','Tim','Tum'].
I assume that age should be an integer, but will it accept decimals? Will it accept 0?
Maybe currentEmployee is supposed to be a Boolean? Or maybe it's a string that contains the name of the user's employer? There's no way to know for certain.
children "feels" like it should be an array - but again, there's no way to know that from the function signature.

So here we have two potential headaches:

There's little-to-no self-documentation going on in this function, so anyone invoking it has to either A. burn precious time reading through the entire function code to know exactly what's expected for each argument, or B. make a best-guess based upon the names of the arguments themselves.

And...

It's extremely difficult to write a robust function that will accept any kind of input for any of these five arguments without throwing an error or returning an aberrant value. What happens if I pass in an object for userId? Or an array for age? Will the code fail gracefully?

(A Little) Help With Default Values

We can make this somewhat cleaner and easier to understand if we add default values to our arguments, like so:

const updateUser = (userId = 0, name = '', age = 0, currentEmployee = false, children = []) => {
   // the update logic...
   return updateResult;
};

This definitely helps the casual developer to quickly grasp the types of values that should be passed into this function. We no longer have to guess about things like integer-vs-GUID userIds.

But this does almost nothing to ensure the proper execution of the function itself. That's because default values will only dictate the data type when no value is supplied. If the caller does, in fact, provide a value for the argument, the supplied value is used, regardless of whatever data type is implied by the default values.

To put this in practical terms, the default argument values don't stop us from doing this:

const updateUser = (userId = 0, name = '', age = 0, currentEmployee = false, children = []) => {
   // the update logic...
   return updateResult;
};

updateUser('007', {first:'Joe', last:'Blow'}, 'not saying', ['sure'], false);

In this case, we've made a real mess of the function invocation by chunking in a whole bunch of mismatched data types that our function probably wasn't expecting. It doesn't matter that the default values implied certain data types. Since we actually supplied our own data, JavaScript allowed us to pass in any data type we chose.
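To make that concrete, here's a small sketch showing that a default only kicks in when the argument is undefined. Any other value, including null, passes straight through. The describeArgs helper is hypothetical and exists only for this illustration:

```javascript
// Hypothetical helper: reports the runtime type each parameter actually received.
const describeArgs = (userId = 0, name = '', children = []) => ({
   userId: typeof userId,
   name: typeof name,
   children: Array.isArray(children) ? 'array' : typeof children,
});

// No arguments: every default applies.
const defaults = describeArgs();
// Mismatched arguments: the defaults are ignored entirely.
const mismatched = describeArgs('007', { first: 'Joe' }, false);
// null is not undefined, so the default for userId is skipped.
const withNull = describeArgs(null);

console.log(defaults);   // { userId: 'number', name: 'string', children: 'array' }
console.log(mismatched); // { userId: 'string', name: 'object', children: 'boolean' }
console.log(withNull);   // { userId: 'object', name: 'string', children: 'array' }
```

The last call is the sneaky one: typeof null is 'object', so a caller who passes null gets neither the default nor anything resembling an integer.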

Here's another way that we can potentially screw up this function:

const updateUser = (userId = 0, name = '', age = 0, currentEmployee = false, children = []) => {
   // the update logic...
   return updateResult;
};

updateUser(0, '', 0);

Technically, we provided the function with the correct data types that are implied in the default values. But even though we accurately supplied integer / string / integer, there's a very good chance that this function invocation could fail or spawn some other kind of bug. Because, while 0, '', 0 definitely satisfies the "spirit" of the function call, there's a very good chance that 0 is an invalid integer to use for userId, that '' (empty string) is an invalid value to use for name, and that 0 is an invalid value to use for age.

So unless the logic inside the function is robust, this might spawn some kind of error or bug - even though we technically passed the proper data types into the function call.

At this point, I can almost hear some of you thinking:

None of this matters, because I'd never call my own function with the wrong types of data/values.

And that's great. I'm glad that your own coding is perfect and flawless. But once you've committed/merged the code for your function, you never technically know who's going to write new code (or alter existing code) to call that function. In other words, once you put your function out there, into the wild, it has to stand on its own. It needs to be as robust, bug-free, and foolproof as possible.

The proper execution of your function should never be dependent upon the idea that the caller will invoke it in the "proper" way.

If there is any "downside" to functional programming, it's that you, as the function's writer, can control anything that happens inside the function. But you can't control how/when it's called.

This is why I believe that JavaScript's dynamic typing is only a critical problem at the entrypoint to functions. Because most functions depend upon the data being presented in a certain format, and of a certain type.

Sure... it's possible to write all the logic inside the function that you need to handle any-and-all types of inputs, but that can be overly laborious and bloat our otherwise sleek-and-efficient functions.
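For a sense of what that bloat looks like, here's a rough sketch of hand-rolled checks for just the first two arguments. The specific rules (a positive integer ID, a non-empty name) and the error messages are assumptions made for illustration:

```javascript
// Sketch only: hand-written guards for two of the five arguments.
const updateUser = (userId, name) => {
   if (typeof userId !== 'number' || !Number.isInteger(userId) || userId <= 0) {
      console.error('userId must be a positive integer');
      return false;
   }
   if (typeof name !== 'string' || name.length === 0) {
      console.error('name must be a non-empty string');
      return false;
   }
   // the update logic would go here...
   return true;
};

console.log(updateUser(42, 'Tom'));    // true
console.log(updateUser('007', 'Tom')); // false (a string, not a number)
console.log(updateUser(42, ''));       // false (empty name)
```

That's ten lines of guarding before any real work happens, and three more arguments would still need the same treatment.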

One Potential Solution

As stated above in the disclaimer, the full/official/accepted way to address this is to use a heavy-duty statically-typed system, like TypeScript or Flow. But that's not always an option. Sometimes you may not want to go to that extreme. Other times, you simply may not have the option to add something like TypeScript to a legacy project.

So are you stuck writing brittle functions? Or writing bloated functions that painstakingly try to account for every possible input? Hardly. The repo that I linked to at the top of this article shows my homegrown solution.

It's really just a single file. A class that I export under the name is. I chose this name because it's very short, and it maps to the values that I expect back from all of is's functions. You see, every validation in the file returns a Boolean. Every function checks to see whether a value conforms to a certain type.

In practical application, it looks like this:

import is from './is';

const updateUser = (userId = 0, name = '', age = 0, currentEmployee = false, children = []) => {
   if (!is.aPositiveInteger(userId) || !is.aPopulatedString(name) || !is.aPositiveInteger(age) || !is.aBoolean(currentEmployee) || !is.anArray(children))
      return;
   // the update logic...
   return updateResult;
};
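The real is.js lives in the repo linked above. As a rough idea of how checks like these could be written, here's a minimal sketch. It is not the repo's actual code: only the method names come from the example above, and the implementations are assumptions.

```javascript
// Minimal sketch, NOT the actual is.js. Only the method names are taken
// from the article's example; the implementations are assumptions.
const is = {
   aBoolean: value => typeof value === 'boolean',
   anArray: value => Array.isArray(value),
   anInteger: value => Number.isInteger(value),
   aPositiveInteger: value => Number.isInteger(value) && value > 0,
   aString: value => typeof value === 'string',
   // Assumption: "populated" means at least one character of any kind.
   aPopulatedString: value => typeof value === 'string' && value.length > 0,
};

console.log(is.aPositiveInteger(0));   // false
console.log(is.aPositiveInteger(12));  // true
console.log(is.aPopulatedString(''));  // false
console.log(is.anArray([]));           // true
console.log(is.aBoolean('true'));      // false
```

This sketch leaves out any console.error() logging on failure, so it only returns the Boolean result.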

Key Points:

If this looks a little wordy, please keep in mind that most functions have only one-or-two arguments. The only reason this looks longer is because there are five separate arguments to check.
In the example above, I'm just bailing out of the function if any of the checks fails with a simple return;. Obviously, depending upon the logic in your function, you may choose to follow a failed check with something like return false; or return null;.
I try to make the checks as specific as possible to the data that's needed inside the function. For example, I don't do if (!is.anInteger(userId))... because userId should really be a positive integer, and we don't want to have a value like 0 or -482 passed in. For the name value, we only want a populated (non-empty) string. It's not enough just to ensure that the supplied value is a string - because the empty string is still, technically, a string. But the empty string is not a valid value. But we're more lenient with regard to children. Because it's perfectly fine for children to consist of nothing but an empty array.
Whenever one of these checks fails, it will log a console.error() message for you to see in the dev tools.
Notice that an argument's default value, combined with the is() check on the next line, tells us whether the argument is truly required. We are supplying a default value for userId of 0. But the is() check ensures that the value is greater than zero. This means, functionally speaking, that it's required for the caller to supply a userId value. But children is not required. It has a default value of [] and the is() check only ensures that the value is, indeed, an array. So the function can be called without supplying any value for children.
There's certainly room to expand the list of validations in is.js. For example, a function could be created to ensure that a value is an array of strings, or an array of integers, or an array of objects. Of course, the more time you spend building out the validations in is.js, the more you have to ask yourself whether you should just be using a robust tool - like TypeScript. So don't go too overboard with this.
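As one example of that kind of expansion, an array-of-strings check could be sketched like this. The name anArrayOfStrings is hypothetical and shown as a standalone function so it runs on its own:

```javascript
// Hypothetical extension: anArrayOfStrings is not a name from is.js.
const anArrayOfStrings = value =>
   Array.isArray(value) && value.every(item => typeof item === 'string');

console.log(anArrayOfStrings(['Tom', 'Tim'])); // true
console.log(anArrayOfStrings(['Tom', 3]));     // false
console.log(anArrayOfStrings([]));             // true (nothing in it fails the test)
console.log(anArrayOfStrings('Tom'));          // false (a string, not an array)
```

Whether an empty array should pass is a design decision. If it shouldn't, add a length check.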

Implementation

It's fairly self-explanatory. But here are two tips that I use whenever deploying this in a non-TypeScript project:

Every argument, in every function, comes with a default value assigned.
The first line inside every function (that accepts arguments) consists of the is() checks needed to ensure that the supplied values conform with their expected data type.
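Put together, the two rules might look like the sketch below. The countChildren function is hypothetical, and the two is checks are stand-in implementations so the snippet runs without the real is.js:

```javascript
// Stand-in checks so the sketch runs on its own (assumed implementations).
const is = {
   aPositiveInteger: value => Number.isInteger(value) && value > 0,
   anArray: value => Array.isArray(value),
};

// Hypothetical function: every argument has a default (rule 1),
// and the first line validates the arguments (rule 2).
const countChildren = (userId = 0, children = []) => {
   if (!is.aPositiveInteger(userId) || !is.anArray(children))
      return null;
   return children.length;
};

console.log(countChildren(7, ['Ann', 'Bo'])); // 2
console.log(countChildren(7));                // 0 (children is optional)
console.log(countChildren());                 // null (userId is effectively required)
console.log(countChildren(7, 'Ann'));         // null (wrong type for children)
```

The default of 0 fails the positive-integer check, which is what makes userId required in practice, while the default of [] passes the array check and keeps children optional.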

That's it. No other "rules" to abide by. I hope this approach helps someone else as well.

Top comments (6)

Pacharapol Withayasakpunt • Jul 27 '20 • Edited

The real problem is not that TypeScript is not type-safe, but that many validation libraries were not made with IDE friendliness in mind.

TypeScript is as type-safe as an IDE can be (might be in actuality lower than that), but we can always raise the bar.

I currently use JSON-schema-based jsonschema-definer. (Used to use zod.) JSON schema definitions are quite vast and well written, and they work well with Swagger. JSON schema and Swagger are supposed to be cross-language as well. The library also extends it with a .custom() function, just in case.

John Carroll • Nov 27 '20 • Edited

Interesting read. It seems like you, very intentionally, created a small (and, I imagine, performant), tool for some quick type checking. I'm guessing this is all you need. But! In case you aren't aware (because I wasn't), there's a validation/type-checking pattern that I call the "decoder" pattern (I'm not sure what the "official" name is) that I've found to be super (!) useful because of its composability.

For example:

import { assert } from 'ts-decoders';
import { objectD, stringD, predicateD, arrayD, nullableD, numberD  } from 'ts-decoders/decoders';

const nonBlankStringD = stringD().chain(
  predicateD(input => input.length > 0, { errorMsg: "cannot be blank" })
);

const myParamValidator = assert(
  objectD({
    id: nonBlankStringD,
    values: arrayD( nullableD(numberD()) ),
  }),
);

// Awesome, this passes
myParamValidator({ id: "apple", values: [1, null] });

// Oops! Throws error with message:
// "must be an object"
myParamValidator("a string value");

// Oops! Throws error with message:
// "invalid value for key [\"id\"] > cannot be blank"
myParamValidator({ id: "", values: [] });

// Oops! Throws error with message:
// "invalid value for key [\"values\"] > invalid value for key [2] > must be a number or null"
myParamValidator({ id: "apple", values: [1, null, "3"] });

// Oops! Throws error:
// "cannot be blank"
assert(nonBlankStringD)("");

This example uses the library I made for this purpose, ts-decoders, but more popular options are libraries like yup or io-ts.

The decoder pattern does add a bit of complexity though, so I can understand if it's not appropriate for your use case. The decoder composability is very nice though, so figured I'd mention it in a conversation about type-checking.

Adam Nathaniel Davis • Nov 27 '20

I knew that there were libraries/packages out there, although I can't claim to have seen this particular pattern. Looks pretty solid! I'd probably be most interested in yup since I'm not really a fan of TS.

Thanks for cluing me into this!

easrng • Apr 20 '20

I just made a library, inspired by this post, that ensures the type of your defaults matches what you call it with. For your example, it looks like

import ct from "https://raw.githack.com/easrng/ct/master/ct.esm.js";
const updateUser = ct((userId = 0, name = '', age = 0, currentEmployee = false, children = []) => {
   // the update logic...
   return updateResult;
});

The code is at github.com/easrng/ct. Any feedback?

Adam Nathaniel Davis • Apr 20 '20 • Edited

First, the phrase "I just made a library, inspired by this post" is always music to my ears. So kudos to you!

As for the specifics of what you've written - I like it! Now, in case anyone else is reading this comment and thinking, "Oh cool - this completely replaces what Adam did in the original post," I'd say: not quite. But that's not a knock on what you've written. Again - I like it!

The difference between your approach and mine is that yours is using the type inference supplied in the default values to ensure that any values passed in match those types. And that's hella-cool. I could absolutely see using your approach in those instances where the default values are sufficient to constrain the function parameters.

But part of my inspiration for writing my little utility class was that, in many cases, I don't find the default values to be sufficient to define what is "acceptable" input. For example, imagine a function like this:

const updateFirstName = (firstName = '') => {
  // do the updating logic here...
}

In this scenario, the function requires a string. And your approach is awesome for ensuring that the firstName value is a string. The only "problem" this leads to is that an empty string is still, technically, a string.

Now it's perfectly feasible that, in some scenarios, it might be acceptable to know that any string was passed in. Maybe the logic for your app is totally cool with the idea that, on occasion, the function might get called with an empty string for the firstName value. But usually, when I'm writing a function like this, or when I'm looking at a function that's already been written like this, it's not really acceptable to think that the firstName value might be the empty string.

You might be thinking, "Well, this is my app, and I know all the places where updateFirstName() is getting called. And I know that, on the calling end, I would never pass in an empty string for firstName." And that's... OK. It's definitely not wrong.

But one of my tenets for writing "the tightest code possible" is that you can never fully guarantee just how a function will be called. Maybe someone else will start working in your codebase and carelessly write some code that will pass an empty string in for firstName? Maybe you haven't 100% accounted for all the places where your own code is calling updateFirstName()? Whatever the scenario, I believe firmly that the proper performance of a function should never be dependent upon the idea that the function is called in the "right" way.

Because every function is, in a quite-literal sense, its own standalone program. And every program should be as fault-tolerant as possible.

So in the scenario above, using my little utility, I would code it up like this:

const updateFirstName = (firstName = '') => {
  if (!is.aPopulatedString(firstName))
    return;
  // do the updating logic here...
}

Notice that I'm not just checking to ensure that the value is a string. I'm specifically ensuring that it's a populated string, cuz there's a much greater likelihood that an empty string is "wrong" and a populated string is "acceptable".

In my utility, I have a bunch of potential checks that look for things like:

Is this a populated string/array/object?
Is this a non-negative integer?
Is this a GUID?
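To give a flavor of those, here's a rough sketch of how checks like these could be written. The function names and implementations here are assumptions for illustration and may not match the utility:

```javascript
// Assumed implementations for illustration; the utility's real checks may differ.
const aPopulatedArray = value => Array.isArray(value) && value.length > 0;

const aPopulatedObject = value =>
   typeof value === 'object' && value !== null && !Array.isArray(value)
   && Object.keys(value).length > 0;

const aNonNegativeInteger = value => Number.isInteger(value) && value >= 0;

// Matches the common 8-4-4-4-12 hexadecimal GUID layout.
const aGuid = value =>
   typeof value === 'string'
   && /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(value);

console.log(aPopulatedArray([]));                             // false
console.log(aPopulatedObject({ first: 'Tom' }));              // true
console.log(aNonNegativeInteger(0));                          // true
console.log(aGuid('123e4567-e89b-12d3-a456-426614174000'));   // true
console.log(aGuid('007'));                                    // false
```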

None of this is meant as any kinda criticism of what you've written. It's cool!!! I'm just pointing out where your utility would shine, and where you might still need to reach for mine.

Thank you for this utility and for the feedback!!

easrng • Apr 20 '20

Thanks for the feedback! I made this mostly because it will make adding the checks to my code a lot easier, because I'm lazy and usually don't do any checks at all. My utility simply removes many of the checks that feel redundant, and I recognize that it is not a replacement for yours.