DEV Community

Danie Palm
Danie Palm

Posted on • Updated on

A Joy REPL - live exploring replicating programs

In previous posts we've developed a Joy parser and interpreter in Elixir. While it was possible to play with them in the interactive Elixir shell IEx, the results were always Elixir lists and function were represented with Elixir atoms. In this post we develop a Joy REPL (read-evaluate-print loop) so that we can LERP (live explore replicating programs).

You can follow along by installing the REPL (and parser and interpreter) from GitHub:

GitHub logo palm86 / joy

Minimal Joy parser and interpreter

If you have Elixir installed, you can simply clone the project and then run mix escript.build from within the project root. It will build the joy executable and place it in ~/bin which is usually automatically on your PATH. Now run joy and enjoy:

   _
  |_|___ _ _
  | | . | | |
 _| |___|_  |
|___|   |___|

Interactive Joy (0.1.0) - press Ctrl+C to exit

joy>

Enter fullscreen mode Exit fullscreen mode

Since the initial introduction of the Joy interpreter in a previous post, I've made some changes that allow for different backends to be used. You can now implement your own backend with your own set of functions. If that is something that interests you, please check out the project documentation for more details.

Biology

The goal of this series of posts is to develop an artificial life system in which as much as possible, and hopefully all, of the semantics of the system come from within. But we are not heading straight to the goal, there is too much fun to be had on the way. Here we take a bit of a detour and explore the famous and mind-boggling Y combinator.

Self-reference is the ability of an entity to mention itself, either directly or indirectly. Direct self-reference is a given in natural language and in many programming languages. The sentence "This sentence contains five words", exemplifies how effortlessly a natural language sentence can speak of itself directly. However, natural language is also capable of indirect self-reference:

"when preceded by its quotation contains sixteen words" when preceded by its quotation contains sixteen words.
Enter fullscreen mode Exit fullscreen mode

In mathematics and biochemistry, however, self-reference can only be achieved in an indirect way, making use of special encodings such as Gödel numbering or the genetic code. However, for the purposes of this post, we will require indirect self-reference in order to carry out some processes that do not require self-reference or encodings in biochemistry.

Replication (DNA to DNA), transcription (DNA to RNA), translation (RNA to protein) and even cell division are all processes that loop until a termination criterion is reached; much like biochemical while-loops. The termination criterion is usually an encounter with a termination subsequence, the end of a strand, or death itself.

The Y combinator provides us with a way to implement the above processes in Joy using indirect self-invocation or anonymous recursion. This is, of course, not to imply that real chemical processes make use of some form of chemical Y combinator in order to accomplish repetition or looping.

Code

We have briefly mentioned the Y combinator before. It is sometimes referred to as a "paradoxical" combinator, because Haskell Curry used it to prove that any logical claim can be proven in untyped lambda calculus (and a few other systems, including combinatory logic), which implies that untyped lambda calculus is not sound as a system for proving logical claims. The Y combinator can be defined in joy as y == [dup cons] swap cat dup cons i. If you want to follow along, run the following in your interactive Joy session:

joy> [y] [[dup cons] swap cat dup cons i] define
y == [dup cons] swap cat dup cons i
Enter fullscreen mode Exit fullscreen mode

The special function define takes two quotations as arguments. At the top of the stack it expects a quoted program (the function body) and below it a quoted single function (the function name).

The function y expects a program, say [p] on top of the stack and then yields [[dup cons p] dup cons p] p. But what exactly is going on here? Let's first look at what the quotation would yield when executed on its own. We define q == [dup cons p] dup cons p. Now:

q == [dup cons p] dup cons p           (definition)
q == [dup cons p] [dup cons p] cons p  (dup applied)
q == [[dup cons p] dup cons p] p       (cons applied)
q == [q] p                             (definition)
Enter fullscreen mode Exit fullscreen mode

This means that q is a fixed-point of p. I.e. it is a value such that applying p to it, leaves it unchanged. And why stop there? If q == [q] p then q == [[q] p] p and so on, for any p. One can begin to see how this can be useful to attain anonymous recursion.

You can read more about recursion in Joy in the Recursion Theory and Joy article by Manfred von Thun, from which the above derivation comes. Also check out this explanation of the Y combinator using lambda calculus.

Let's try it out in the REPL:

joy> [q] [[dup cons p] dup cons p] define
q == [dup cons p] dup cons p

joy> q
[[dup cons p] dup cons p] p
Enter fullscreen mode Exit fullscreen mode

At the moment, we haven't defined p, so it just puts itself on the stack. Let's look at some interesting definitions of [p], such as [] or [i]. First clear the stack with as many zaps as required. And then run:

joy> [] y
[[dup cons] dup cons]
Enter fullscreen mode Exit fullscreen mode

or

joy> [i] y
Killed
Enter fullscreen mode Exit fullscreen mode

The first one yields the familiar quine. The second one never terminates until the process is finally killed. This happens because i keeps on unquoting the [q] which evaluates to [[dup cons i] dup cons i] i which enters the loop again. The way to get to recursion that eventually terminates with a useful result, is to define p such that it only conditionally unquotes [q]. I.e. in the base case p should not invoke [q], and instead only do so in the recursive case.

We will take up this point again in a later post. Here is the source code for the REPL:

defmodule Joy.REPL do
  @splash """
     _
    |_|___ _ _
    | | . | | |
   _| |___|_  |
  |___|   |___|

  Interactive Joy (#{String.trim(File.read!("VERSION"))}) - press Ctrl+C to exit
  """

  def main(_args \\ []) do
    IO.puts(IO.ANSI.cyan() <> @splash <> IO.ANSI.white())

    loop([])
  end

  defp loop(stack) do
    # read
    input = IO.gets(IO.ANSI.magenta() <> "joy> " <> IO.ANSI.white())

    # evaluate
    stack =
      with {:ok, parsed_input} <- Joy.Parser.parse(input),
           {:ok, stack} <- Joy.Interpreter.interpret(parsed_input, stack) do
        # print
        stack
        |> Joy.Formatter.format(direction: :stack)
        |> IO.puts()

        stack
      else
        {:error, reason} ->
          # print
          IO.puts(IO.ANSI.red() <> "Error: #{inspect(reason)}" <> IO.ANSI.white())

          stack
      end

    # loop
    loop(stack)
  end
end
Enter fullscreen mode Exit fullscreen mode

The flow is relatively simple:

  1. Print the splash and start a loop
  2. Prompt the user for input from the standard input
  3. Parse the input and interpret it
  4. Print the result or error
  5. Re-enter the loop with updated state

In an initial attempt, I used a GenServer to keep the state of the REPL, but since we are already in a loop (and loops are what Elixir uses under the hood to keep state anyway), it makes much more sense to pass the state recursively to the loop function.

A last word on the special define function. This function is special because it lets you define custom functions. These custom functions certainly don't get compiled into the Joy.Interpreter implementation. So how does it work? The custom functions are defined as anonymous Elixir functions and get stored in a map. Now where do we store that map? We could define a GenServer to keep track of the map, or we could send the map along with the stack to every other function, but neither of these solutions strike me as ideal.

So instead the map of custom functions are stored inside the dictionary of the process in which it runs. Each process in Elixir has such a dictionary. This is how it works. The following function is automatically defined in every interpreter backend.

def define(stack) do
  [quotation, [name] | rest] = stack

  func = fn fn_stack ->
    fn_stack
    |> __execute(quotation)
  end

  custom_definitions =
    (Process.get()[:custom_definitions] || %{})
    |> Map.put(name, func)

  Process.put(:custom_definitions, custom_definitions)

  IO.puts(
    IO.ANSI.yellow() <>
      "#{name} == #{Joy.Formatter.format(quotation)}" <>
      IO.ANSI.white()
  )

  rest
end
Enter fullscreen mode Exit fullscreen mode

It pops the function name and implementation off the stack. Then creates an anonymous function that receives the stack as input,
executes the implementation on the stack, and returns the resulting new stack. The anonymous function is stored in the process dictionary under the key :custom_definitions, which has as value a map of custom anonymous functions, each keyed by the function name.

We also update the __execute function of the interpreter to check the process dictionary for functions if they are not already defined directly in the interpreter module:

def __execute(stack, program) when is_list(stack) and is_list(program) do
  Enum.reduce(program, stack, fn
    function, stack when is_atom(function) ->
      cond do
        Kernel.function_exported?(__MODULE__, function, 1) ->
          Kernel.apply(__MODULE__, function, [stack])

        # The relevant new addition
        function in Map.keys(Process.get()[:custom_definitions] || %{}) ->
          func = Process.get()[:custom_definitions][function]
          func.(stack)

        true ->
          [function | stack]
      end

    quotation, stack when is_list(quotation) ->
      [quotation | stack]
  end)
end
Enter fullscreen mode Exit fullscreen mode

In closing, the Joy library and REPL are not nearly polished, but already provides a nice environment for experimentation. I don't have any ambition to implement the entire set of standard libraries, and not all the features required for that are available yet. But expect more features and refinement.

Top comments (0)