Danie Palm


# Church encoding in the concatenative language Joy

I've previously written about Church encoding in the Joy programming language. The article focused on booleans, logic and predicates, but didn't touch on numbers. Church numerals have been haunting me ever since, and after stumbling on two other concatenative languages, Dawn and Stck, things have finally fallen in place for me (avoiding an obvious pun here) and I've been able to simplify the Church boolean scheme as well. Instead of
updating my previous article, which would be a rewrite anyway, I'm doing a new article on Church encoding in Joy.

This article is technically part of the series Elixir of Life, but is intended to be self-contained.

## Joy

Feel free to skip this intro to Joy if you've read any of my other Joy articles.

Joy is a functional, stack-based concatenative programming language developed by Manfred von Thun. It has many similarities to lambda calculus, combinatory logic, and Lisp. Like combinatory logic, it differs from lambda calculus in that it doesn't feature lambda abstraction (the mechanism by which formal parameters are specified in lambda calculus) and therefore has no concept of free variables. Joy differs from combinatory logic in that it is based not on function application but on function composition. All Joy functions, often called combinators, take a stack as their sole input and return an updated stack as their result. Joy uses postfix notation, also often called reverse Polish notation.

Joy is a fully-fledged programming language packed with all kinds of convenience functions that facilitate everyday program flow, such as conditional execution, repeated execution, recursion, and IO. It also comes with a rich set of types. But in this article, we show how almost all of that can be thrown away and rebuilt from only a handful of functions and quotations. A quotation is essentially a program literal that takes the form of a list of functions that remain unevaluated unless explicitly executed. The following set of combinators, which can actually be reduced even further, is enough to render Joy Turing-complete:

``````
dup  - duplicates the top of the stack
pop  - removes the top of the stack
swap - swaps the two top elements of the stack
i    - executes a quotation (removes wrapping []'s)
unit - quotes the top of the stack (wraps it in []'s)
cons - prepends the element below the top of the stack to a quotation on top
cat  - concatenates the two top elements (quotations) of the stack
``````

Here is a simple Joy program: `[] [dup] [pop] swap i i`. It evaluates to `[]`. Terms are executed from left to right. We first place `[]`, `[dup]`, and `[pop]` on the stack. Then `swap` reads `[pop]` and `[dup]` from the stack, switches them around and puts them back on the stack. Next `i` unquotes `[dup]` which was on top of the stack, and thus causes it to be executed, duplicating `[pop]` which was right below it on the stack. Finally, the next `i` unquotes `[pop]`, which then removes its duplicate that was below it on the stack. This leaves only `[]` on the stack.
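If you'd like to play along, the seven primitives are easy to model as a toy stack machine. Here is a small Python sketch (Python is just a stand-in, of course; quotations become Python lists, and anything that isn't one of the seven primitive names is pushed as data):

```python
# A toy evaluator for the seven Joy primitives; quotations are Python lists.
def run(program, stack=None):
    stack = [] if stack is None else stack
    for term in program:
        if term == "dup":
            stack.append(stack[-1])
        elif term == "pop":
            stack.pop()
        elif term == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif term == "i":                  # unquote and execute the top quotation
            run(stack.pop(), stack)
        elif term == "unit":               # wrap the top of the stack in []'s
            stack.append([stack.pop()])
        elif term == "cons":               # prepend the element below the top quotation
            q = stack.pop()
            x = stack.pop()
            stack.append([x] + q)
        elif term == "cat":                # concatenate the two top quotations
            q2 = stack.pop()
            q1 = stack.pop()
            stack.append(q1 + q2)
        else:                              # anything else (a quotation) is pushed as data
            stack.append(term)
    return stack

# The example program: [] [dup] [pop] swap i i  ->  leaves [] on the stack
print(run([[], ["dup"], ["pop"], "swap", "i", "i"]))  # [[]]
```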

## Church encoding

Church encoding is a scheme by which any calculable operator and its data (operands) can be expressed purely in terms of functions. It has traditionally been explored mostly in the context of lambda calculus, but it works just as well in concatenative languages, in combinatory logic, and - I imagine - in Lisp. In fact, it works in any Turing-complete language.

In simple terms, you only need a single primitive: the function. All other primitives (numbers, booleans, pairs, lists) can be encoded in terms of functions.

For example, in Joy (remember it uses postfix notation) it is possible to express the predicate `1 + 2 = 3` as follows:

``````
1 2 + 3 =
``````

where each of `1`, `2`, `+`, `3`, and `=` is a composite function that is defined in terms of a handful of primitive functions such as `dup`, `pop`, `swap`, `cat`, `unit`, and `cons`. When executed, it leaves the value `true` on the stack (since the sum of 1 and 2 does in fact equal 3), where `true` is also a composite function defined in terms of a handful of primitives.

According to our scheme, the definition of `true` happens to be:

`true == [pop]`

But there are numerous different schemes, each with its own pros and `cons` (not avoiding an obvious pun here). There is no canonical representation of any literal, and its semantics as an operator (say `+`) or operand (say `1`) arises from the scheme as a whole and how it maps onto logic and arithmetic.

In this article we employ a scheme in which literals are represented as quotations - function compositions that remain unevaluated until explicitly evaluated.

## Church booleans

Booleans can be represented as functions that expect two values on the stack, returning the one argument if true and the other if false. In other words, we can write reduction rules for `true` and `false` as follows:

``````
A B true i == A
``````

and

``````
A B false i == B
``````

These reduction rules lead to the following definitions:

``````
A B true i == A
A B true i == A B pop
A B true i == A B [pop] i
true   ==     [pop]
``````
``````
A B false i == B
A B false i == B A pop
A B false i == A B swap pop
A B false i == A B [swap pop] i
false   ==     [swap pop]
``````

Don't stress if you don't quite follow the derivation process yet; we'll get back to it in more detail.

Note that we could easily reverse the definitions of `true` and `false`. As long as we construct the boolean operators correctly, the definitions of `true` and `false` are largely arbitrary.

While it is convenient to think of `A` and `B` as arguments of the functions `true` and `false`, they are in fact two elements on top of a stack that is the only argument of the functions.

Let's see `true` in action:

``````
A B true i
A B [pop] i   # definition of true
A B pop       # i unquotes the element on top of the stack
A             # pop discards the top of the stack
``````

Admittedly, there is nothing "boolean" here yet. The boolean nature of `true` and `false` will only appear once we introduce some operators. Note the need for `i` (apply) after the boolean function. It is required because `true` and `false` are quotations and therefore "inert" unless applied.
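For intuition, the same selection behaviour can be modelled in Python, with booleans as two-argument selector functions (a sketch of the idea, not of Joy's stack mechanics):

```python
# Church booleans as selectors: true keeps the lower element (pops the top),
# false keeps the top element (swap pop).
true  = lambda a, b: a    # A B true i == A
false = lambda a, b: b    # A B false i == B

print(true("A", "B"))     # A
print(false("A", "B"))    # B
```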

The first operator that we'll consider is `and`.

### The `and` operator

The semantics we expect from `and` is as follows:

``````
false false and == false
false true and == false
true false and == false
true true and == true
``````

And we achieve that with the following definition of `and`:

``````
and == dup i
``````

It is easy to verify that the above definition works as expected, but it
is not immediately clear how we arrived at the solution. Let's start with
some reduction rules and work our way down to a definition. In case `B` is
true, the reduction rule is:

``````
A B and == A
``````

And in case `B` is false, the rule is:

``````
A B and == B
``````

In order to derive a definition for `and` we want to rewrite the reduction rules in the form:

``````
A B and == A B C
``````

where `C` is the definition of `and`. We do this by systematically introducing function compositions to the right hand side that are equivalent to the identity function `id`. This is similar to how introducing `+ 0` or `+ x - x` to a mathematical expression has no effect on its value.

We start with the case where `B` is true.

``````
A B and == A                 # if B is true, only A counts
== A B pop           # since B pop == id
== A B [pop] i       # i unquotes [pop]
== A B true i        # definition of true
== A B B i           # since B is true
``````

And in case `B` is false:

``````
A B and == B                 # if B is false, we can ignore A
== B A pop           # since A pop is a no-op
== A B swap pop      # A B swap == B A
== A B [swap pop] i  # [swap pop] i == swap pop
== A B false i       # definition of false
== A B B i           # since B is false
``````

And so we arrive at a common reduction rule that is independent of the value of B:

``````
A B and == A B B i
== A B dup i
``````

And therefore:

``````
and == dup i
``````
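In the selector model from before, `dup i` reads as "duplicate B and let it pick between A and B", which we can check in Python (a sketch, not Joy itself):

```python
# Church booleans as selectors over (A, B); a Python analogue of the stack picture.
true  = lambda a, b: a            # true == [pop]
false = lambda a, b: b            # false == [swap pop]

# and == dup i : duplicate B, then apply it as a selector to (A, B)
and_ = lambda a, b: b(a, b)       # A B and == A B B i

assert and_(true,  true)  is true
assert and_(true,  false) is false
assert and_(false, true)  is false
assert and_(false, false) is false
```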

### The `or` operator

We expect the following semantics from an `or` operator:

``````
false false or == false
false true or == true
true false or == true
true true or == true
``````

Using the same deductive process as before, we arrive at the following
definition:

``````
or == [dup] swap unit cat i swap i
== [dup] dip swap i      # defining dip == swap unit cat i
``````

At first glance this is quite a bit more involved than the `and` operator,
but we'll get to a more elegant definition below.

### The `not` operator

The `not` operator should simply negate the boolean that is on top of the
stack:

``````
false not == true
true not == false
``````

Let us once again work our way down from reduction rules to a definition. If `A` is true, then:

``````
A not == false
== false true pop     # true pop == id
== false true [pop] i # quote then unquote
== false true true i  # definition of true
== false true A i     # since A is true
``````

And similarly, when `A` is false:

``````
A not == true
== true false pop
== false true swap pop
== false true [swap pop] i
== false true false i
== false true A i  # since A is false
``````

Now, regardless of the value of `A`:

``````
A not == false true A i
A not == A [false true] dip i
not == [false true] dip i
``````

Armed with `not`, the definition of `or` simplifies to:

``````
or == dup not i
``````
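Both `not` and the simplified `or` translate directly into the Python selector sketch:

```python
true  = lambda a, b: a            # true == [pop]
false = lambda a, b: b            # false == [swap pop]

# not == [false true] dip i : apply A as a selector to (false, true)
not_ = lambda a: a(false, true)
# or == dup not i : negate B, then use the result as a selector over (A, B)
or_  = lambda a, b: not_(b)(a, b)

assert not_(true) is false and not_(false) is true
assert or_(false, false) is false
assert or_(true, false) is true
assert or_(false, true) is true and or_(true, true) is true
```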

We can also define `nand == and not` and `nor == or not`. Incidentally, `nand`
and `nor` are the only two functionally complete operators, meaning that reduction rules for any other boolean operator can be written purely in terms of either `nand` or `nor`. Let's briefly look at how `nand` can be implemented using only `nor`:

``````
A B nand == A A nor B B nor nor A A nor B B nor nor nor
``````

And, for symmetry, how `nor` can be implemented using only `nand`:

``````
A B nor == A A nand B B nand nand A A nand B B nand nand nand
``````

The reader is invited to check the veracity of these reduction rules and to derive definitions for `nand` and `nor` from them.
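In the meantime, we can at least check the nand-from-nor rule over all four input pairs with the Python selector model (a sketch):

```python
true  = lambda a, b: a
false = lambda a, b: b
and_  = lambda a, b: b(a, b)          # and == dup i
not_  = lambda a: a(false, true)      # not == [false true] dip i
or_   = lambda a, b: not_(b)(a, b)    # or == dup not i
nand  = lambda a, b: not_(and_(a, b)) # nand == and not
nor   = lambda a, b: not_(or_(a, b))  # nor == or not

# A B nand == A A nor B B nor nor A A nor B B nor nor nor
def nand_from_nor(a, b):
    x = nor(nor(a, a), nor(b, b))     # (not A) nor (not B) == A and B
    return nor(x, x)                  # negating that gives nand

for a in (true, false):
    for b in (true, false):
        assert nand_from_nor(a, b) is nand(a, b)
```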

### Conditionals

So far we have seen how choosing `true == [pop]` and `false == [swap pop]` has
allowed us to devise definitions for all the usual boolean operators. Our choice, which is not the only possible choice, is based on the idea of selectively discarding functions. I.e., pop the top of the stack, or pop the value below the top of the stack.

It is not surprising then that we can also define functions that behave like
conditionals or if-then-else blocks. We have essentially already come across a primitive if-then-else block in the definition of `not`.

We can take this one step further and define `branch` with the following reduction rule:

``````
cond [then] [else] branch == then # if cond is true
cond [then] [else] branch == else # if cond is false

cond [then] [else] branch == [then] i # if cond is true
cond [then] [else] branch == [else] i # if cond is false

cond [then] [else] branch == [then] [else] pop i # if cond is true
cond [then] [else] branch == [then] [else] swap pop i # if cond is false

cond [then] [else] branch == [then] [else] [pop] i i # if cond is true
cond [then] [else] branch == [then] [else] [swap pop] i i # if cond is false

cond [then] [else] branch == [then] [else] true i i # if cond is true
cond [then] [else] branch == [then] [else] false i i # if cond is false

cond [then] [else] branch == [then] [else] cond i i # if cond is true
cond [then] [else] branch == [then] [else] cond i i # if cond is false

# from this point onward the reduction rule is valid for any value of cond
cond [then] [else] branch == [then] [else] cond i i
cond [then] [else] branch == cond [then] [else] dig2 i i
cond [then] [else] branch == cond [then] [else] unit cons dip i i
branch == unit cons dip i i
``````

We now have a simple conditional. If the condition is true we execute a then-block, otherwise we execute an else-block. It would be more useful if we also had a conditional block, so to speak. I.e. a quotation that, when executed, yields a boolean. This is where comparators like `=`, `<`, and `>` would be useful.
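In the Python selector sketch, `branch` is exactly "select one of the two quotations with cond, then execute it"; quotations become zero-argument functions (thunks):

```python
true  = lambda a, b: a
false = lambda a, b: b

# branch == unit cons dip i i : use cond as a selector over the two
# quotations, then execute the winner ([then] [else] cond i i).
def branch(cond, then_q, else_q):
    return cond(then_q, else_q)()

assert branch(true,  lambda: "then", lambda: "else") == "then"
assert branch(false, lambda: "then", lambda: "else") == "else"
```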

In order to arrive at definitions for such comparators, we need to look at Church numbers first.

## Church numerals

Where Church booleans were centred around optional execution, Church numerals are centred around repeated execution.

A quick note on terminology. I will use "numeral" and "number" interchangeably, but they are not the same thing. "Numeral" is syntax, "number" is semantics. Both the Arabic numeral "3" and the Roman numeral "III" can have the meaning of the number three in the right context.

The semantics we are looking for is this:

``````
[A] 0 i ==
[A] 1 i == A
[A] 2 i == A A
[A] 3 i == A A A
.
.
.
[A] N i == A A A ... A
``````

A definition for `0` that satisfies the above reduction rule is `0 == [pop]`. Similarly, `1 == [i]` will do as a definition for `1`. For `2`, things are slightly more complicated:

``````
[A] 2 i == A A
== [A A] i
== [A] [A] cat i
== [A] dup cat i
== [A] [dup cat i] i
2 == [dup cat i]
``````

We are hoping for some kind of pattern to emerge, so that we don't need a new definition for every
numeral. Let's take a look at `3`:

``````
[A] 3 i == A A A
== [A A A] i
== [A A] [A] cat i
== [A] [A] [A] cat cat i
== [A] [A] dup cat cat i
== [A] dup dup cat cat i
== [A] [dup] 2 i [cat] 2 i i
== [A] [[dup] 2 i [cat] 2 i i] i
3 == [[dup] 2 i [cat] 2 i i]
``````

That gives us a definition of `3` in terms of its predecessor `2`. We could expand the definition
all the way and replace `2` with `[dup cat i]`, but we won't gain much in doing so. Let's rather find out whether the pattern continues. I.e. can we define `4` in terms of its predecessor `3`? It turns out that we can - the reader is invited to find a derivation - and it also turns out that the pattern holds for all `n` where `n > 0`.

We now have a way to apply a quotation `n` times, but that in itself doesn't really mean that `2` has the semantics of the number two. Not without operators.
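The "apply a quotation n times" reading can be modelled in Python, with a numeral as a function that applies `f` to `x` n times (a sketch; `f` plays the quotation `[A]`):

```python
# Church numerals as "apply a quotation N times" (Python sketch):
zero  = lambda f, x: x            # 0 == [pop]: never apply f
one   = lambda f, x: f(x)         # 1 == [i]
two   = lambda f, x: f(f(x))      # 2 == [dup cat i]
three = lambda f, x: f(f(f(x)))   # 3, defined via its predecessor in the text

# Count the applications by appending an "A" each time:
assert zero(lambda s: s + "A", "") == ""
assert two(lambda s: s + "A", "") == "AA"
assert three(lambda s: s + "A", "") == "AAA"
```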

As a stepping stone to addition, the first operator that we'll consider is `succ`, the successor function.

### The successor function `succ`

We expect the following semantics from a successor function:

``````
0 succ == 1
1 succ == 2
2 succ == 3
.
.
.
``````

In order to derive a definition for `succ`, we will exploit the fact that numbers are defined in
terms of their predecessors. That is, if `N` is our starting point, then by carefully positioning and manipulating `N` to arrive at `[[dup] N i [cat] N i i]`, we will have found a definition for `succ`:

``````
N succ == [[dup] N i [cat] N i i]
.
.
.
succ == unit [i] cat [[dup]] swap dup [[cat]] swap [i] cat cat cat cat
``````

Addition is the repetition of succession. Adding *m* to *n* is equivalent to calculating the *m*th successor of *n*. This leads us to the following reduction rule:

``````
N M + == N succ succ ... succ         # succ repeated M times
N M + == N [succ] M i
N M + == N M [succ] swap i
+ == [succ] swap i
``````

For example:

``````
2 3 +
2 3 [succ] swap i
2 [succ] 3 i
2 succ succ succ
3 succ succ
4 succ
5
``````
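Here is the same idea in the Python numeral sketch, where `succ` adds one more application of `f`, and `+` applies `succ` M times:

```python
zero = lambda f, x: x                         # 0 == [pop]
succ = lambda n: (lambda f, x: f(n(f, x)))    # successor: one extra application of f
to_int = lambda n: n(lambda k: k + 1, 0)      # decode a numeral for inspection

# N M + == N [succ] M i : apply succ M times to N
add = lambda n, m: m(succ, n)

two   = succ(succ(zero))
three = succ(two)
assert to_int(add(two, three)) == 5
```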

### Multiplication

Multiplication can be defined in terms of addition. The product of n and m is equivalent to adding n to 0, m times. That is, 2 times 3 is 0 + 2 + 2 + 2. We can generalize multiplication with the following reduction rule:

``````
N M × == 0 N + N + ... N +      # "N +" repeated M times
N M × == 0 [N +] M i
N M × == [N +] M 0 bury2 i
N M × == N [+] cons M 0 bury2 i
N M × == N M [[+] cons] dip 0 bury2 i
× == [[+] cons] dip 0 bury2 i
``````

For example:

``````
2 3 ×
2 3 [[+] cons] dip 0 bury2 i
2 [+] cons 3 0 bury2 i
[2 +] 3 0 bury2 i
0 [2 +] 3 i
0 2 + 2 + 2 +
2 2 + 2 +
4 2 +
6
``````
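And multiplication as "add N to 0, M times" in the same sketch:

```python
zero = lambda f, x: x
succ = lambda n: (lambda f, x: f(n(f, x)))
to_int = lambda n: n(lambda k: k + 1, 0)
add = lambda n, m: m(succ, n)                 # + == [succ] swap i

# N M × == 0 [N +] M i : apply "N +" M times, starting from 0
mul = lambda n, m: m(lambda acc: add(acc, n), zero)

two   = succ(succ(zero))
three = succ(two)
assert to_int(mul(two, three)) == 6
```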

### Exponentiation

The derivation of the `pow` function is very similar to that of `×`. However, instead of repeatedly adding a value to 0, we now repeatedly multiply 1 by some value. That is, 2^3 is equivalent to 1 times 2 times 2 times 2. Or, in general:

``````
N M pow == 1 N × N × ... N ×       # N × repeated M times
N M pow == 1 [N ×] M i
.
.
.
pow == [[×] cons] dip 1 bury2 i
``````

For example:

``````
2 3 pow
2 3 [[×] cons] dip 1 bury2 i
2 [×] cons 3 1 bury2 i
[2 ×] 3 1 bury2 i
1 [2 ×] 3 i
1 2 × 2 × 2 ×
2 2 × 2 ×
4 2 ×
8
``````
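Exponentiation follows the same shape, "multiply 1 by N, M times", in the sketch:

```python
zero = lambda f, x: x
succ = lambda n: (lambda f, x: f(n(f, x)))
to_int = lambda n: n(lambda k: k + 1, 0)
add = lambda n, m: m(succ, n)                       # + == [succ] swap i
mul = lambda n, m: m(lambda acc: add(acc, n), zero) # × == repeated addition
one = succ(zero)

# N M pow == 1 [N ×] M i : apply "N ×" M times, starting from 1
pow_ = lambda n, m: m(lambda acc: mul(acc, n), one)

two   = succ(one)
three = succ(two)
assert to_int(pow_(two, three)) == 8
```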

### Predecessor function

So far, we've only dealt with functions that increase the value of a number. To get to subtraction, division, and roots, we need to start with
a predecessor function `pred`. Here is what we expect from a predecessor function:

``````
0 pred == 0       # I'll explain this later
1 pred == 0
2 pred == 1
3 pred == 2
.
.
.
``````

Defining `pred` using Church encoding is quite complicated. Here is how we will go about it. We want to start at 0 and repetitively add 1 to it, but instead of repeating this n times, we'll repeat it n - 1 times. In a sense, that just defers the problem, because now we have to figure out when to stop. The secret lies in applying some
function that will act like `succ` n times, except for the very first application. This we can achieve by maintaining a boolean along with our numerical value that keeps track of whether we are at the first function application or not. This leads us to the following reduction rule:

``````
N pred == 0 false [[succ true] [true] branch] N i pop
``````

From which we can derive a definition:

``````
pred == 0 false dig2 [[succ true] [true] branch] swap i pop
``````

To illustrate how it works, let's consider `2 pred`:

``````
2 pred
2 0 false dig2 [[succ true] [true] branch] swap i pop
0 false 2 [[succ true] [true] branch] swap i pop
0 false [[succ true] [true] branch] 2 i pop
0 false [succ true] [true] branch [succ true] [true] branch pop
0 true [succ true] [true] branch pop
0 succ true pop
1 true pop
1
``````
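The trick of pairing the running value with a "have we started yet" boolean is easy to mirror in Python (plain ints stand in for the numerals to keep the sketch short):

```python
def repeat(f, n, x):                      # [A] N i : apply f n times
    for _ in range(n):
        x = f(x)
    return x

# pred: start from (0, false); each application succs only if the flag is set
def pred(n):
    def step(pair):                       # plays [[succ true] [true] branch]
        value, started = pair
        return (value + 1, True) if started else (value, True)
    value, _ = repeat(step, n, (0, False))
    return value

assert pred(0) == 0
assert pred(1) == 0
assert pred(3) == 2
```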

Note that the predecessor of 0 is 0. The encoding we are dealing with here only applies to the natural numbers, so no negatives. Extending the encoding to integers and reals and even complex numbers is possible, but not within the scope of this article.

### Subtraction

Equipped with `pred`, we can move on to subtraction (`-`):

``````
N M - == N [pred] M i
``````

And thus:

``````
- == [pred] swap i
``````
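In the Python sketch, `-` is just `pred` applied M times, with the truncation at 0 coming along for free:

```python
def repeat(f, n, x):                      # [A] N i : apply f n times
    for _ in range(n):
        x = f(x)
    return x

pred = lambda k: max(k - 1, 0)            # truncated predecessor from above
sub  = lambda n, m: repeat(pred, m, n)    # - == [pred] swap i

assert sub(5, 2) == 3
assert sub(2, 5) == 0                     # truncated at 0: no negative naturals
```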

## Church comparators and predicates

So far we have looked at Church booleans and Church numbers. We even had to combine them to arrive at a definition for `pred`. We now
turn to predicates. The first predicate we look at is `is-zero`. In order to arrive at a definition for `is-zero`, we will exploit the
fact that the numeral `0` can be used to apply some quoted function zero times, whereas any other numeral n, will apply the quoted
function n times. If we choose the quotation to be `[pop false]` and start with a value of `true`, then if we apply the function zero times,
we end up with `true`, which is what we want. If on the other hand we apply the function once or more, we will pop the initial `true` value
and replace it with `false`, only to pop `false` and replace it with `false` again, repeat n times. This leads us to the following reduction rules:

``````
0 is-zero == true
1 is-zero == true pop false
2 is-zero == true pop false pop false
3 is-zero == true pop false pop false pop false
.
.
.
N is-zero == true [pop false] N i
``````

And thus:

``````
is-zero == [true [pop false]] dip i
``````
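The same idea in the Python sketch: start from `True`, and any application of the quotation flips it to `False` for good:

```python
def repeat(f, n, x):                      # [A] N i : apply f n times
    for _ in range(n):
        x = f(x)
    return x

# N is-zero == true [pop false] N i : every application discards the
# current boolean and replaces it with false.
is_zero = lambda n: repeat(lambda b: False, n, True)

assert is_zero(0) is True
assert is_zero(3) is False
```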

Also, since n - m = 0 whenever m is greater than or equal to n (our subtraction truncates at 0), we can make the following definitions:

``````
<= == - is-zero
``````

and

``````
>= == swap - is-zero
``````

Finally, if n <= m and n >= m, then we know that n = m:

``````
m n = == m n <= m n >= and
m n = == [m n] i <= [m n] i >= and
m n = == [m n] [m n] [i <=] dip i >= and
m n = == [m n] [m n] [i <=] swap unit cat i i >= and
m n = == [m n] dup [i <=] swap unit cat i i >= and
m n = == m [n] cons dup [i <=] swap unit cat i i >= and
m n = == m n unit cons dup [i <=] swap unit cat i i >= and
= ==     unit cons dup [i <=] dip i >= and
``````
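The three comparators translate directly into the truncated-subtraction sketch:

```python
def repeat(f, n, x):                      # [A] N i : apply f n times
    for _ in range(n):
        x = f(x)
    return x

pred    = lambda k: max(k - 1, 0)
sub     = lambda n, m: repeat(pred, m, n)  # truncated subtraction
is_zero = lambda n: n == 0

le = lambda n, m: is_zero(sub(n, m))       # <= == - is-zero
ge = lambda n, m: is_zero(sub(m, n))       # >= == swap - is-zero
eq = lambda n, m: le(n, m) and ge(n, m)    # = : n <= m and n >= m

assert le(2, 3) and ge(3, 2)
assert eq(3, 3) and not eq(2, 3)
```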

## Advanced numerical operators

We have now built up a sizeable arsenal of composite operators and operands. It is time to address division and root extraction.

### Division

The basic principle that we'll follow is that the quotient is the number of times that you can subtract the divisor from the dividend before you reach 0.
In other words `10 2 ÷ 5 =` means that `10 [2 -] 5 i 0 =` (subtracting 2 five times from 10 yields 0). So in order to determine `N M ÷` we want to start with a quotient `Q` of 1 and increment its value iteratively until the following no longer holds:

``````
N [M -] Q i 0 >=
``````

One way to achieve this is to use the `y` combinator, which makes anonymous recursion possible. In the base case, when our condition no longer holds, we terminate and only leave the quotient on the stack. In the recursive case, we increment the quotient and recheck the condition. The resulting reduction rule for division is:

``````
N M ÷ == 0 true [
  bury2
  [
    succ dup [N [M -]] dip i 0 >= dig2 i
  ]
  [swap pop]
  branch
] y
``````

The objective is to extract `N` and `M` to the front of the right hand side. The result is a definition for `÷`:

``````
÷ == [
  [[bury2] dip] dip
  [
    [[succ dup] dip] dip [[-] cons] cons cons dip i 0 >= dig2 i
  ] cons cons
  [swap pop]
  branch
] cons cons [0 true] dip y
``````

### Root extraction

Determining the *n*th root is very similar to division. But instead of repetitive subtraction, we now employ repetitive division. This time we increment the root `R` until the following no longer holds:

``````
N [M ÷] R i 1 >=
``````

The reduction rule for root extraction looks like this:

``````
N M √ == 0 true [
  bury2
  [
    succ dup [N [M ÷]] dip i 1 >= dig2 i
  ]
  [swap pop]
  branch
] y
``````

From which we can derive the definition of `√`:

``````
√ == [
  [[bury2] dip] dip
  [
    [[succ dup] dip] dip [[÷] cons] cons cons dip i 1 >= dig2 i
  ] cons cons
  [swap pop]
  branch
] cons cons [0 true] dip y
``````
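As with division, the principle can be sanity-checked in Python, again with a loop in place of the recursion and an informal stopping test:

```python
def div(n, m):                 # truncated division from the previous section
    q = 0
    while n >= m:
        n -= m
        q += 1
    return q

# N M √ : increment R while dividing N by M, R times, still leaves at least 1
def root(n, m):
    r, v = 0, n
    while v >= m:              # informal version of the stopping check
        v = div(v, m)
        r += 1
    return r

assert root(8, 2) == 3         # cube root of 8
assert root(9, 3) == 2         # square root of 9
```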

## Putting it all together

We started out with the ambition to devise an encoding using only basic functions by which the following would evaluate to true:

``````
1 2 + 3 =
``````

Let's put it to the test (taking a few shortcuts here and there):

``````
1 2 + 3 =
3 3 =
3 3 unit cons dup [i <=] dip i >= and      # definition of =
3 [3] cons dup [i <=] dip i >= and
[3 3] dup [i <=] dip i >= and
[3 3] [3 3] [i <=] dip i >= and
[3 3] i <= [3 3] i >= and
3 3 <= 3 3 >= and
3 3 - is-zero 3 3 swap - is-zero and
0 is-zero 3 3 swap - is-zero and
0 is-zero 3 3 - is-zero and
0 is-zero 0 is-zero and
0 [true [pop false]] dip i 0 [true [pop false]] dip i and
true [pop false] 0 i true [pop false] 0 i and
true true and
true true dup i                           # definition of and
true true true i
true true [pop] i                         # definition of true
true true pop
true
``````
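The same end-to-end check passes in the Python sketch used throughout (ints play the numerals, truncated subtraction plays `-`):

```python
def repeat(f, n, x):                                # [A] N i : apply f n times
    for _ in range(n):
        x = f(x)
    return x

pred = lambda k: max(k - 1, 0)
sub  = lambda n, m: repeat(pred, m, n)              # - == [pred] swap i
add  = lambda n, m: repeat(lambda k: k + 1, m, n)   # + == [succ] swap i
le   = lambda n, m: sub(n, m) == 0                  # <= == - is-zero
ge   = lambda n, m: sub(m, n) == 0                  # >= == swap - is-zero
eq   = lambda n, m: le(n, m) and ge(n, m)

assert eq(add(1, 2), 3)                             # 1 2 + 3 = reduces to true
```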

## Applications

Besides the ability to do simple arithmetic and logic, we can also put Church-encoded numerals to other uses.

### Digging and burying

One such application lies in generalizing stack shuffling functions like `dig` and `bury`.

The `dig` family of functions is used to dig values up from deep down on the stack. `dig`, or `dig1`, is equivalent to `swap`, whereas `dig2` brings the second value below the top to the top. And so on. Here are definitions for some `dig` functions:

``````
dig1 == [] cons dip
dig2 == [] cons cons dip
dig3 == [] cons cons cons dip
``````

Clearly, the deeper you want to dig, the more `cons` functions are required. We can generalize `dig` as follows:

``````
dign == [] [cons] N i dip
``````

The `bury` family does the opposite. It buries the top of the stack. Once again, `bury1` is equivalent to `swap`. Here are definitions for some `bury` functions:

``````
bury1 == [[] cons] dip swap i
bury2 == [[] cons cons] dip swap i
bury3 == [[] cons cons cons] dip swap i
``````

And likewise, we can generalize this to:

``````
buryn == [[] [cons] N i] dip swap i
``````
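On a Python list used as a stack (top of stack at the end of the list), `dign` and `buryn` are simple rotations; a sketch:

```python
# dig_n / bury_n as rotations on a Python list acting as a stack.
def dig(stack, n):             # bring the element n places below the top to the top
    stack.append(stack.pop(-n - 1))
    return stack

def bury(stack, n):            # push the top down below n elements
    stack.insert(len(stack) - 1 - n, stack.pop())
    return stack

assert dig(["A", "B", "C"], 1) == ["A", "C", "B"]     # dig1 == swap
assert dig(["A", "B", "C"], 2) == ["B", "C", "A"]     # dig2
assert bury(["A", "B", "C"], 2) == ["C", "A", "B"]    # bury2
```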

## Summary of definitions

Miscellaneous helpers:

``````
dip   == swap unit cat i
y     == [dup cons] swap cat dup cons i
dign  == [] [cons] N i dip
buryn == [[] [cons] N i] dip swap i
``````

Boolean operators:

``````
true   == [pop]
false  == [swap pop]
and    == dup i
or     == [dup] dip swap i
not    == [false true] dip i
nand   == and not
nor    == or not
``````

Numerical operators:

``````
N       == [[dup] M i [cat] M i i]
succ    == unit [i] cat [[dup]] swap dup [[cat]] swap [i] cat cat cat cat
+       == [succ] swap i
×       == [[+] cons] dip 0 bury2 i
pow     == [[×] cons] dip 1 bury2 i
pred    == 0 false dig2 [[succ true] [true] branch] swap i pop
-       == [pred] swap i
is-zero == [true [pop false]] dip i
÷       == [[[bury2] dip] dip [[[succ dup ] dip] dip [[-] cons] cons cons dip i 0 >= dig2 i] cons cons [swap pop] branch] cons cons [0 true] dip y
√       == [[[bury2] dip] dip [[[succ dup ] dip] dip [[÷] cons] cons cons dip i 1 >= dig2 i] cons cons [swap pop] branch] cons cons [0 true] dip y
``````

where `N` is any natural number and `M` is the predecessor of `N`.

Conditionals and comparators:

``````
branch == unit cons dip i i
<=     == - is-zero
>=     == swap - is-zero
=      == unit cons dup [i <=] dip i >= and
``````