Let's imagine a simple domain where the web application we are building provides articles to read.
We have "User" entities: the readers of the articles. They have a first and last name, and an email address. They might have a middle name initial.
While their verification is in progress, they should be limited in the number of articles they can read (let's say 3 articles max). Once verified, the limit should be lifted.
The data related to a "User" is provided by an external source, e.g. the API of a web service.
Initial type definition
Given the description of the "User" entity above, we can write the following type:
interface User {
  firstName: string
  lastName: string
  emailAddress: string
  middleNameInitial?: string
  remainingReadings?: number
  verifiedDate?: number
}
This type doesn't tell much about the constraints and logic of the domain. We can easily misunderstand some domain rules by only reading the types. For example, can we have both verifiedDate and remainingReadings defined at the same time? Or neither of them? Can an "unverified user" have a verifiedDate? It shouldn't, but the type doesn't prevent that: this interface allows illegal states in our software.
Moreover, we have no idea how many characters are allowed for the names, or if remainingReadings can be a negative or floating-point number. We can guess from the name of the property, but guessing is not good enough. We don't want to make assumptions here; we want precise answers given to us!
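To make this concrete, here is a short sketch of values that the compiler accepts even though they break the domain rules described above. The variable names and sample values are made up for illustration.

```typescript
interface User {
  firstName: string
  lastName: string
  emailAddress: string
  middleNameInitial?: string
  remainingReadings?: number
  verifiedDate?: number
}

// Verified and still limited at the same time: this compiles.
const verifiedButLimited: User = {
  firstName: 'Ada',
  lastName: 'Lovelace',
  emailAddress: 'ada@example.com',
  remainingReadings: 3,
  verifiedDate: 1700000000000,
}

// Neither verified nor limited: this compiles too.
const neitherVerifiedNorLimited: User = {
  firstName: 'Ada',
  lastName: 'Lovelace',
  emailAddress: 'ada@example.com',
}

// An empty name, a string that is not an email address and a
// negative, fractional counter: still a valid `User` for the compiler.
const nonsensicalValues: User = {
  firstName: '',
  lastName: 'Lovelace',
  emailAddress: 'not an email address',
  remainingReadings: -1.5,
}
```

None of these assignments is rejected, so every rule has to be enforced (and discovered) somewhere else in the code.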
With a type like this one, if we want all the answers we need to look at the implementation, which might be scattered around in the code base and polluted by code unrelated to the domain.
Before jumping into writing a better type, allow me to talk about a couple of concepts.
What's an illegal state?
It's a state that the static types allow, but that can never occur in practice because of how the runtime code behaves.
For example, a typical type definition written for API responses might look like this:
interface ApiResponse {
  data?: unknown
  error?: string
}
The problem with this type is that it allows 2 states that will never exist at runtime:
// this compiles, but it shouldn't
const illegalState1: ApiResponse = {
  data: { foo: 'foo' },
  error: 'some error message'
}
// this also compiles, and it shouldn't
const illegalState2: ApiResponse = {}
We know that the API response is an object with either a data property or an error property. We can never get an empty object, or one with both properties defined. But we don't know that unless we take a look at the code handling API responses.
When we have codependent properties (e.g. data is set when error isn't, and vice versa), a good practice is to use a sum type.
What's a sum type?
I am not going to explain what it is, mainly because other people have already done that before me (e.g. in this article by the creator of fp-ts).
In TypeScript, a sum type can be written using a discriminated (or tagged) union type.
interface SuccessfulApiResponse {
  type: 'SuccessfulApiResponse' // the "tag", or discriminant
  data: unknown
}
interface FailedApiResponse {
  type: 'FailedApiResponse'
  error: string
}
// the sum type
type ApiResponse = SuccessfulApiResponse | FailedApiResponse
Here, by reading the types, we know that an API response can either succeed or fail. If it's successful then we have access to some data, otherwise we get an error message. We got rid of the illegal states: there is no way to assign an empty object, or an untagged object carrying both properties, to the ApiResponse type.
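As a quick check, the two illegal states from the previous section can be replayed against the new type. This is only a sketch: the @ts-expect-error comments assert that the line below them fails to type-check, and the ok and ko names are made up for the example.

```typescript
interface SuccessfulApiResponse {
  type: 'SuccessfulApiResponse'
  data: unknown
}
interface FailedApiResponse {
  type: 'FailedApiResponse'
  error: string
}
type ApiResponse = SuccessfulApiResponse | FailedApiResponse

// @ts-expect-error: an empty object matches neither member of the union
const illegalState1: ApiResponse = {}

// @ts-expect-error: without a tag, this object matches neither member
const illegalState2: ApiResponse = { data: { foo: 'foo' }, error: 'some error message' }

// The only two shapes left are the legal ones.
const ok: ApiResponse = { type: 'SuccessfulApiResponse', data: { foo: 'foo' } }
const ko: ApiResponse = { type: 'FailedApiResponse', error: 'some error message' }
```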
The downside (in my opinion) of using a sum type in TypeScript is its implementation. Since TypeScript is a language that uses structural typing instead of nominal typing, we have to set the "tag" in the runtime object. We need a parser function that takes the original object as a parameter, then transforms it into the correct version of ApiResponse by adding the type property.
// use this e.g. in a middleware to parse every API response object
function parseApiResponse(res: { data: unknown } | { error: string }): ApiResponse {
  return 'data' in res
    ? { ...res, type: 'SuccessfulApiResponse' }
    : { ...res, type: 'FailedApiResponse' }
}
Then, when reading this ApiResponse object, we have to check its type property to know if it's a successful or a failed one, and do something with the data it carries.
import { absurd } from 'fp-ts/function'
function handleResponse(res: ApiResponse): void {
  switch (res.type) {
    case 'SuccessfulApiResponse':
      return console.log('Data', res.data)
    case 'FailedApiResponse':
      return console.log('Error message', res.error)
    default:
      // this function ensures exhaustiveness. If we forget a
      // case, it won't compile anymore.
      return absurd(res)
  }
}
We can see 2 steps here:
- First we build the data objects of the sum type: we take the data and we pair it with some tag (the type in the example above). This is what we do in the parseApiResponse function. We can also use constructor functions to build these objects.
- Then at some point, we want to "extract" the data out of the sum type object. This is where we use some kind of pattern matching (cf. the handleResponse function).
For example, if we take the Either type from fp-ts, which is a sum type:
- The constructor functions are right and left.
- We can use the fold function to do something depending on the actual type used at runtime (Left or Right).
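The two steps can be sketched with a hand-rolled, dependency-free stand-in for Either. It mirrors the tagged shape that fp-ts uses, but it is not the library's code; in a real project, right, left and fold would be imported from 'fp-ts/Either'. The parseAge and showAge functions are hypothetical and exist only for this illustration.

```typescript
// Simplified stand-in for fp-ts's Either, written for this example.
type Left<E> = { readonly _tag: 'Left'; readonly left: E }
type Right<A> = { readonly _tag: 'Right'; readonly right: A }
type Either<E, A> = Left<E> | Right<A>

// Step 1: the constructors pair a value with its tag.
const left = <E, A = never>(e: E): Either<E, A> => ({ _tag: 'Left', left: e })
const right = <A, E = never>(a: A): Either<E, A> => ({ _tag: 'Right', right: a })

// Step 2: fold runs one of two handlers depending on the tag.
const fold =
  <E, A, B>(onLeft: (e: E) => B, onRight: (a: A) => B) =>
  (fa: Either<E, A>): B =>
    fa._tag === 'Left' ? onLeft(fa.left) : onRight(fa.right)

// Hypothetical parser: a non-negative integer goes to Right, anything else to Left.
function parseAge(input: string): Either<string, number> {
  const n = Number(input)
  return Number.isInteger(n) && n >= 0
    ? right(n)
    : left(`"${input}" is not a valid age`)
}

const showAge = fold(
  (error: string) => `Error: ${error}`,
  (age: number) => `Age: ${age}`
)

showAge(parseAge('42'))  // "Age: 42"
showAge(parseAge('abc')) // 'Error: "abc" is not a valid age'
```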
These 2 steps add some significant boilerplate to the code base. The creator of fp-ts made a small tool to generate this boilerplate code (and more) in TypeScript using Haskell-like syntax to define Algebraic Data Types (ADTs): fp-ts-codegen. We can also generate product types, which are basically records.
-- input
data ApiResponse A = SuccessfulApiResponse A | FailedApiResponse string
// output
export type ApiResponse<A> = {
readonly type: "SuccessfulApiResponse";
readonly value0: A;
} | {
readonly type: "FailedApiResponse";
readonly value0: string;
};
export function successfulApiResponse<A>(value0: A): ApiResponse<A> { return { type: "SuccessfulApiResponse", value0 }; }
export function failedApiResponse<A>(value0: string): ApiResponse<A> { return { type: "FailedApiResponse", value0 }; }
export function fold<A, R>(onSuccessfulApiResponse: (value0: A) => R, onFailedApiResponse: (value0: string) => R): (fa: ApiResponse<A>) => R { return fa => { switch (fa.type) {
case "SuccessfulApiResponse": return onSuccessfulApiResponse(fa.value0);
case "FailedApiResponse": return onFailedApiResponse(fa.value0);
} }; }
The output can be adapted manually, for example to rename properties. We can see 3 parts: type definition, constructors and "handler" (the fold function).
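Here is a sketch of how these three parts might be used together. The definitions are copied from the generated output above (only reformatted) so that the snippet stands alone; the toMessage function and the string[] payload are assumptions made for the example.

```typescript
// Part 1: type definition (from the generated output).
type ApiResponse<A> =
  | { readonly type: 'SuccessfulApiResponse'; readonly value0: A }
  | { readonly type: 'FailedApiResponse'; readonly value0: string }

// Part 2: constructors (from the generated output).
function successfulApiResponse<A>(value0: A): ApiResponse<A> {
  return { type: 'SuccessfulApiResponse', value0 }
}
function failedApiResponse<A>(value0: string): ApiResponse<A> {
  return { type: 'FailedApiResponse', value0 }
}

// Part 3: the "handler" (from the generated output).
function fold<A, R>(
  onSuccessfulApiResponse: (value0: A) => R,
  onFailedApiResponse: (value0: string) => R
): (fa: ApiResponse<A>) => R {
  return fa => {
    switch (fa.type) {
      case 'SuccessfulApiResponse': return onSuccessfulApiResponse(fa.value0)
      case 'FailedApiResponse': return onFailedApiResponse(fa.value0)
    }
  }
}

// Usage (hypothetical): turn a response carrying article titles into a message.
const toMessage = fold(
  (articles: string[]) => `Got ${articles.length} article(s)`,
  (error: string) => `Request failed: ${error}`
)

toMessage(successfulApiResponse(['Intro to ADTs']))  // "Got 1 article(s)"
toMessage(failedApiResponse<string[]>('timeout'))    // "Request failed: timeout"
```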
Anyway, a single line in a language that supports sum types, such as Haskell or F#, turns into a significant number of lines in TypeScript, hence the "downside" I mentioned earlier. Nevertheless, I think sum types are very useful to remove illegal states from the code base. This allows us to write fewer unit tests, and should help us better understand what's going on with the data in the code base.
In the next articles of this series, we'll see how we can use a sum type to get rid of property combinations that are impossible (or illegal). In addition, we'll use smart constructors to build meaningful types out of primitive ones. These smart constructors are similar to the newtypes from Haskell, and single case union types from F#.
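As a rough preview, here is one possible shape these two ideas could take for the "User" entity. This is a sketch under assumptions, not necessarily what the rest of the series ends up with: every name below (UserBase, UnverifiedUser, VerifiedUser, RemainingReadings, canRead) is invented for the example, and the "brand" is only a convention that a type assertion can bypass.

```typescript
// A "branded" number: by convention, only the smart constructor below creates one.
type RemainingReadings = number & { readonly __brand: 'RemainingReadings' }

// Smart constructor: accepts only whole numbers from 0 to 3
// (the limit given in the domain description), rejects everything else.
function remainingReadings(n: number): RemainingReadings | undefined {
  return Number.isInteger(n) && n >= 0 && n <= 3
    ? (n as RemainingReadings)
    : undefined
}

interface UserBase {
  firstName: string
  lastName: string
  emailAddress: string
  middleNameInitial?: string
}
interface UnverifiedUser extends UserBase {
  type: 'UnverifiedUser'
  remainingReadings: RemainingReadings // always present, never with a verifiedDate
}
interface VerifiedUser extends UserBase {
  type: 'VerifiedUser'
  verifiedDate: number // always present, never with remainingReadings
}
// The sum type: a user is exactly one of the two.
type User = UnverifiedUser | VerifiedUser

function canRead(user: User): boolean {
  switch (user.type) {
    case 'VerifiedUser': return true
    case 'UnverifiedUser': return user.remainingReadings > 0
  }
}
```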