DEV Community

loading...

Non-empty arrays and lists in ReScript

John Jackson
I’m probably riding my bicycle or coding.
・6 min read

Processing ordered data is one of the most common things we do while coding. JavaScript gives us the familiar array data type. ReScript uses arrays but also inherits a list type from OCaml, which has some similar qualities but with different tradeoffs. But what if you want some variable number of ordered items, either in an array or a list, but you need to have at least one item?

A practical use case

This came up while I was developing my custom template language. This language uses this syntax: match a, b, c with 1, 2, 3 (Which basically translates to if (a === 1 && b === 2 && c === 3).) The sequence after the match statement can have any number of items: a (1), a, b (2), a, b, c (3), and so on. But it has to have at least one item.

Originally, I compiled this into the abstract syntax tree (AST) using a regular list. But this had two major downsides:

  1. The compile step would only generate an AST if the list had at least one item. But functions later in the the render step, which processed the AST, would still have to handle theoretical cases where the lists were empty, which led to unnecessary checks and unused code. (Further reading: Parse, don’t validate.)
  2. If I made a mistake writing the compiler and forgot to check that there was at least one item, then the bug wouldn’t appear until runtime, and the error messages would be cryptic.

What if the ReScript compiler could guarantee that the data was never empty? Fortunately, it can!

Option 1: creating a NonEmpty list

ReScript doesn’t include a non-empty ordered type by default, but creating one is easy. The simplest version of this is a non-empty list.

First, it helps to understand how a list works. Even though it’s built into the compiler and has a special syntax, it’s really just a normal variant that’s defined roughly like this: type rec list<'a> = Empty | Item('a, list<'a>). You can define a custom list type like this and use it almost identically to the built-in list, as you can see for yourself here.

With this in mind, we can create a type that looks like a list but cannot be empty. It might look something like this:

type nonEmpty<'a> = NonEmpty('a, list<'a>)
Enter fullscreen mode Exit fullscreen mode

This NonEmpty constructor must have one item, and then the rest of the items are stored in a nested list (which may or may not be empty).

You may have noticed that a variant with one constructor is basically a tuple, and you aren’t wrong. We could easily use type nonEmpty<'a> = ('a, list<'a>). The difference is that tuples are structurally typed and variants are nominally typed. Because our variant will always require an explicit name, then that makes our code a bit safer and easier to debug for type errors.

Now, whenever you need a value with the nonEmpty type, it will be impossible to create it without any items!

The advantage to using the list type as the “tail” in our NonEmpty constructor is that lists are already variants themselves, so converting to a regular list is easy.

let toList = (NonEmpty(head, tail)) => list{head, ...tail}
Enter fullscreen mode Exit fullscreen mode

Side note: If you’re not familiar with ReScript-style pattern-matching, this function accepts one argument which is being de-structured. It’s a shorthand for this:

let toList = (x) => {
  let NonEmpty(head, tail) = x
  list{head, ...tail}
}
Enter fullscreen mode Exit fullscreen mode

You can also create non-empty versions of standard list functions easily.

let map = (NonEmpty(head, tail), f) =>
  NonEmpty(f(head), Belt.List.map(tail, f))
Enter fullscreen mode Exit fullscreen mode

All of this means that you can continue to use familiar functions like map or reduce without too much added complexity.

Option 2: creating a NonEmpty array, with two different methods

Although lists are usually a good fit for functional programming, arrays have other advantages. When compiling to JavaScript, they’re usually more performant and they are more easily representable when serialized to JSON. For these reasons, I ended up using arrays instead of lists for my project. We can still make a non-empty version of arrays, but it’s a bit different.

Option 2a: Using an abstract type

The simplest version of a non-empty array is to simply validate its size and cast it to an abstract type.

module NonEmpty: {
  type t<'a>
  let make: array<'a> => option<t<'a>>
  let toArray: t<'a> => array<'a>
} = {
  type t<'a> = array<'a>
  let make = a =>
    switch a {
    | [] => None
    | a => Some(a)
    }
  let toArray = a => a
}
Enter fullscreen mode Exit fullscreen mode

Internally, our t type is just a normal array. But to the outside code, it’s a distinct type. The only way to create it is with the NonEmpty.make function, which returns None if the array is empty.

For many cases, this will be good enough. But it does come with a few limitations due to the fact that it's a runtime check and not a compile-time check.

  1. Inside the NonEmpty module, the type system still doesn’t track the size of the arrays themselves. If we resize the arrays with functions like slice, then it’s our responsibility to validate the size of the result.
  2. If your code always produces non-empty arrays, then the None path from NonEmpty.make will never be necessary.
  3. The make function will always add a bit more overhead than our non-empty list version does.

Option 2b: Using a variant

We can create a variant just like in our list version.

type nonEmpty<'a> = NonEmpty('a, array<'a>)
Enter fullscreen mode Exit fullscreen mode

But because arrays themselves aren’t variant structures like lists are, this also comes with some limitations.

Converting to or from normal arrays isn’t as efficient. Functions like concat are less performant than their list counterparts. Some functions, like map, may be easy to apply to this type, but more complex operations will introduce more overhead, and you’ll need to get creative in how you apply them.

That being said, this was the type I ended up using in my project. The downsides weren’t relevant to my use case. This is partially because I was mostly using custom functions already, so I could optimize them myself.

Bonus: how to write recursive functions

ReScript, being a “functional” language, is geared towards using pure functions as opposed to loops. Most of the time, recursive functions are compiled into while loops in the JavaScript code anyway, bringing you the best of both worlds.

Consider this function that takes a list of strings and concatenates them:

let concatStrings = l => {
  let rec aux = (acc, l) =>
    switch l {
    | list{} => acc
    | list{head, ...tail} => aux(acc ++ head, tail)
    }
  aux("", l)
}
Enter fullscreen mode Exit fullscreen mode

How would we write the same thing but with non-empty lists? We need to adjust the recursive part (aux) to take a head argument.

type t<'a> = NonEmpty('a, list<'a>)

let concatStrings = (NonEmpty(head, tail)) => {
  let rec aux = (acc, head, l) => {
    let acc = acc ++ head
    switch l {
    | list{} => acc
    | list{head, ...tail} => aux(acc, head, tail)
    }
  }
  aux("", head, tail)
}
Enter fullscreen mode Exit fullscreen mode

Or you can use a non-empty array version.

open Belt // Opening Belt makes array access (a[i]) type-safe

type t<'a> = NonEmpty('a, array<'a>)

let concatStrings = (NonEmpty(head, tail)) => {
  let rec aux = (acc, head, index) => {
    let acc = acc ++ head
    switch tail[index] {
    | None => acc
    | Some(head) => aux(acc, head, succ(index))
    }
  }
  aux("", head, 0)
}
Enter fullscreen mode Exit fullscreen mode

You can compare each of these versions here alongside their compiled JavaScript. They are all very similar.

You can also forego the non-empty type entirely by adjusting these functions so they take two arguments, like this: let concatStrings = (head, tail) => ... That may often be a simpler solution in cases where you don’t need to pass around a special data structure.

Conclusion

The ReScript type system is advanced enough that many problems can be solved in the types themselves, rather than in runtime code. This follows the mantra of “make impossible states impossible to represent.” If you’re able to craft your types so that it’s impossible to represent errors, then that eliminates a whole category of bugs. It also makes your code more performant by minimizing unnecessary paths.

Discussion (0)

Forem Open with the Forem app