Tomasz Wegrzanowski

Posted on Feb 4, 2022

100 Languages Speedrun: Episode 79: Designing New Esoteric Language Tuples

#ruby #esoteric #programming

I have an idea for a new esoteric language, but before I bother with such messy things as parsers and exact details of how it would work, I just want to validate the concept.

And for that I'll be using the best tool - Ruby. Ruby makes it really easy to create Domain Specific Languages (DSLs), and that makes it also great for creating language prototypes.

I've never seen a language like the one I want to try before, but with so many esoteric languages being created all the time, it's entirely possible that someone already make something similar. Especially since it's such a simple concept.

Tuples

So the idea is this. The program is a bunch of rules for transforming tuples of integers into other tuples of integers, like this:

-> (1, 1, 1)
(a, b, c) -> (b, a+b, c+1)

Program state is a set of such tuples. As long as it can, it will keep generating more tuples, ignoring any duplicates. Program ends once no new tuple can be generated. There's no execution order, as result will be the same no matter how we do it.

Initial state can be empty, as -> rule with with no dependencies will just always work and add that tuple to the set.

This example program by the way, will generate all Fibonacci numbers without ever ending.

When program starts, the first rule will apply adding this tuple:

(1, 1, 1)

And then second rule will apply forever, adding new tuples to the set:

(1, 2, 2)
(2, 3, 3)
(3, 5, 4)
(5, 8, 5)
(8, 13, 6)
(13, 21, 7)
etc.

Now to make this idea useful, we want some kind of I/O, and some way of stopping.

I/O

For I/O we could designate some special kind of tuples like (0, 1, index, value) that index-th character of output is value.

We could also designate (0, 0, index, value) as index-th character of input is value, but for convenience I'll flag programs that expect input.

Now that's a lot messier part, but I think it would be more practical if we added some way to indicate end of input, so let's use (0, 0, index, 0) for that, with index just after the last character.

I might need to change that 0 to some other number, but we get into messy issues of Unicode and bytes and so on, so let's keep it simple for now. It's all Unicode characters with UTF-8 encoding.

One important part is that once the program ends, we collect every (0, 1, index, value) tuple that exists, and we skip over any gaps.

Control flow

Now for how the program stops, and for all the if/else and looping needs. I think the easiest rule is that tuples can only have zero or positive integers. Any rule output with negative numbers will not be added to the state.

For now let's only have rules with 0 or 1 dependency. I'll discuss this limitation later.

Hello, World!

Here's the program in possible Tuples syntax:

-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)
-> (0, 1, 7, 87)
-> (0, 1, 8, 111)
-> (0, 1, 9, 114)
-> (0, 1, 10, 108)
-> (0, 1, 11, 100)
-> (0, 1, 12, 33)
-> (0, 1, 13, 10)

And here's Ruby DSL I used:

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run do
  tuple 0, 1, 0, 72
  tuple 0, 1, 1, 101
  tuple 0, 1, 2, 108
  tuple 0, 1, 3, 108
  tuple 0, 1, 4, 111
  tuple 0, 1, 5, 44
  tuple 0, 1, 6, 32
  tuple 0, 1, 7, 87
  tuple 0, 1, 8, 111
  tuple 0, 1, 9, 114
  tuple 0, 1, 10, 108
  tuple 0, 1, 11, 100
  tuple 0, 1, 12, 33
  tuple 0, 1, 13, 10
end

Super simple.

$ ./hello.rb
Hello, World!

Hello, Name!

Let's try processing some input:

$ ./hello_name.rb
Alice
Hello, Alice!

For this we need to filter two things out of the input, as it will be "Alice\n\x00".

As the filter we can use b-11, which is negative for b <= 10. So 0 and 10 bytes won't get passed through, while everything else will.

Output is:

0-6 "Hello, "
100-199 name from user input
200-201 "!\n"

The program will not like it if your name is too long.

-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)
(0, 0, a, b) -> (2, b-11, 100+a, b)
(2, a, b, c) -> (0, 1, b, c)
-> (0, 1, 200, 33)
-> (0, 1, 201, 10)

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run(input: true) do
  tuple 0, 1, 0, 72
  tuple 0, 1, 1, 101
  tuple 0, 1, 2, 108
  tuple 0, 1, 3, 108
  tuple 0, 1, 4, 111
  tuple 0, 1, 5, 44
  tuple 0, 1, 6, 32
  rule(0, 0, :a, :b) { [2, b-11, 100+a, b] }
  rule(2, :a, :b, :c) { [0, 1, b, c] }
  tuple 0, 1, 200, 33
  tuple 0, 1, 201, 10
end

Loop

$ ./loop.rb
1
2
3
4
5
6
7
8
9
10

Generating numbers 1 to 10 is really easy. It turns out converting numbers to string is also very easy, but doing it naively will get extra zeroes, so we need extra step to filter out the zeroes.

-> (1, 1, 9)
(1, a, b) -> (1, a+1, b-1)
(1, a, b) -> (2, (a/10)-1, 10*a+0, 48+(a/10))
(1, a, b) -> (2, 0, 10*a+1, 48+(a%10))
(1, a, b) -> (2, 0, 10*a+2, 10)
(2, a, b, c) -> (0, 1, b, c)

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run do
  # Initial state
  tuple 1, 1, 9
  # Generate data
  rule(1, :a, :b) { [1, a+1, b-1]}

  # Generate output, number to string conversion
  # But send it to extra stage so we can filter the parts we don't want
  # if second arg is negative, it won't get added to the state
  rule(1, :a, :b) { [2, (a/10)-1, 10*a+0, 48+(a/10)] }
  rule(1, :a, :b) { [2, 0, 10*a+1, 48+(a%10)] }
  rule(1, :a, :b) { [2, 0, 10*a+2, 10] }

  # Filtering is done, output the result
  rule(2, :a, :b, :c) { [0, 1, b, c] }
end

Fibonacci

$ ./fib.rb
fib(1)=1
fib(2)=1
fib(3)=2
fib(4)=3
fib(5)=5
fib(6)=8
fib(7)=13
fib(8)=21
fib(9)=34
fib(10)=55
...
fib(99)=218922995834555169026
fib(100)=354224848179261915075

The program to generate Fibonacci sequence is just starting state, and initial rule, but we'd need to add extra number to count down to zero.

Interestingly it's really easy to write general purpose string to number converter in just three rules. That's the 11, * and 12, * rules. We pass the index of the last character to output to that function, and it will go backwards as far as it needs to. Again, if it's more than 100 digits, it will mess up the result.

_ means nothing special, it's just a variable name.

-> (10, 1, 1, 1, 99)

(10, i, a, b, cnt) -> (10, i+1, b, a+b, cnt-1)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 0, 102)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 1, 105)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 2, 98)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 3, 40)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 200, 41)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 201, 61)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 401, 10)
(10, i, a, b, cnt) -> (11, 1000*i + 199, i)
(10, i, a, b, cnt) -> (11, 1000*i + 399, a)

(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run do
  # Initial state
  tuple 10, 1, 1, 1, 99
  # Generate data
  rule(10, :i, :a, :b, :cnt) { [10, i+1, b, a+b, cnt-1]}
  # print "fib(",  ")=", and "\n" parts
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 0, 102] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 1, 105] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 2, 98] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 3, 40] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 200, 41] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 201, 61] }
  rule(10, :i, :a, :b, :cnt) { [0, 1, 1000*i + 401, 10] }

  # Send i and fib(i) to conversion function to output at the right places
  rule(10, :i, :a, :b, :cnt) { [11, 1000*i + 199, i] }
  rule(10, :i, :a, :b, :cnt) { [11, 1000*i + 399, a] }

  # Convert string to number and output it
  rule(11, :i, :n) { [0, 1, i, 48+(n%10)] }
  rule(11, :i, :n) { [12, (n/10)-1, i-1, n/10] }
  rule(12, :_, :i, :n) { [11, i, n] }
end

Output map for line 7 is:

7000-7003 "fib("
7100-7199 converter value of i
7200-7201 ")="
7300-7399 converter value of fib(i)
7400 "\n"

FizzBuzz

Most of the program is very simple.

Tuples 1.* are the data. 11.* and 12.* are number to string conversion. 20.*, 21.*, 22.*, and 23.* are various conversion functions.

The only complicated part is logic of which printing function to choose. As the only thing Tuples can do is a >= 0, it would be simple enough to check do FizzBuzz (-(i%15)) and number branches ((i%3)*(i%5)-1), but it's not so easy to figure out formula to do Fizz and Buzz. So I make print functions accept two conditional arguments instead.

-> (1, 1, 99)

(1, a, b) -> (1, a+1, b-1)

(1, i, _) -> (0, 1, 1000*i + 100, 10)

(1, i, a) -> (20, (i%3)-1, (i%5)-1, i)
(1, i, a) -> (21, -(i%3), (i%5)-1, i)
(1, i, a) -> (22, (i%3)-1, -(i%5), i)
(1, i, a) -> (23, -(i%3), -(i%5), i)

(20, a, b, i) -> (11, 1000*i + 99, i)

(21, a, b, i) -> (0, 1, 1000*i + 0, 70)
(21, a, b, i) -> (0, 1, 1000*i + 1, 105)
(21, a, b, i) -> (0, 1, 1000*i + 2, 122)
(21, a, b, i) -> (0, 1, 1000*i + 3, 122)

(22, a, b, i) -> (0, 1, 1000*i + 0, 66)
(22, a, b, i) -> (0, 1, 1000*i + 1, 117)
(22, a, b, i) -> (0, 1, 1000*i + 2, 122)
(22, a, b, i) -> (0, 1, 1000*i + 3, 122)

(23, a, b, i) -> (0, 1, 1000*i + 0, 70)
(23, a, b, i) -> (0, 1, 1000*i + 1, 105)
(23, a, b, i) -> (0, 1, 1000*i + 2, 122)
(23, a, b, i) -> (0, 1, 1000*i + 3, 122)
(23, a, b, i) -> (0, 1, 1000*i + 5, 66)
(23, a, b, i) -> (0, 1, 1000*i + 6, 117)
(23, a, b, i) -> (0, 1, 1000*i + 7, 122)
(23, a, b, i) -> (0, 1, 1000*i + 8, 122)

(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run do
  # Initial state
  tuple 1, 1, 99
  # Generate data
  rule(1, :a, :b) { [1, a+1, b-1]}

  # Print all "\n"s
  rule(1, :i, :_) { [0, 1, 1000*i + 100, 10] }

  # Decide which print function to call
  rule(1, :i, :a) { [20, (i%3)-1, (i%5)-1, i] }
  rule(1, :i, :a) { [21, -(i%3), (i%5)-1, i] }
  rule(1, :i, :a) { [22, (i%3)-1, -(i%5), i] }
  rule(1, :i, :a) { [23, -(i%3), -(i%5), i] }

  # Print number
  rule(20, :a, :b, :i) { [11, 1000*i + 99, i] }

  # Print Fizz
  rule(21, :a, :b, :i) { [0, 1, 1000*i + 0, 70] }
  rule(21, :a, :b, :i) { [0, 1, 1000*i + 1, 105] }
  rule(21, :a, :b, :i) { [0, 1, 1000*i + 2, 122] }
  rule(21, :a, :b, :i) { [0, 1, 1000*i + 3, 122] }

  # Print Buzz
  rule(22, :a, :b, :i) { [0, 1, 1000*i + 0, 66] }
  rule(22, :a, :b, :i) { [0, 1, 1000*i + 1, 117] }
  rule(22, :a, :b, :i) { [0, 1, 1000*i + 2, 122] }
  rule(22, :a, :b, :i) { [0, 1, 1000*i + 3, 122] }

  # Print FizzBuzz
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 0, 70] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 1, 105] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 2, 122] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 3, 122] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 5, 66] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 6, 117] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 7, 122] }
  rule(23, :a, :b, :i) { [0, 1, 1000*i + 8, 122] }

  # Convert string to number and output it
  rule(11, :i, :n) { [0, 1, i, 48+(n%10)] }
  rule(11, :i, :n) { [12, (n/10)-1, i-1, n/10] }
  rule(12, :_, :i, :n) { [11, i, n] }
end

Cat

This program just copies whatever it gets to the output.

$ echo -e "Hello\nWorld" | ./cat.rb
Hello
World

We need one rule to do the copying, and one to know when to stop (when we get a 0).

(0, 0, i, c) -> (1, c-1, i, c)
(1, _, i, c) -> (0, 1, i, c)

#!/usr/bin/env ruby

require_relative "tuples"

Tuples.run(input: true) do
  rule(0, 0, :i, :c) { [1, c-1, i, c] }
  rule(1, :_, :i, :c) { [0, 1, i, c] }
end

Turing Completeness

The language is trivially Turing complete, and we can prove it by making it run Turing machines.

If machine is in state A with current cell value B, tape to the left [L0, L1, L2...], and tape to the right [R0, R1, R2, ...], and number of states N, then we can do state transitions while staying in place:

(A, B, L, R) -> (newA, newB, L, R)

Transition and moving left:

(A, B, L, R) -> (newA, L%N, L/N, (R*N)+newB)

Transitions and moving right:

(A, B, L, R) -> (newA, R%N, (L*N)+newB, R/N)

And any final state will just trigger some rule (finalState, B, L, R) -> ... while not having any regular transitions.

Of course my goal is not to create a "Turing Tarpit", my goal is to create a fun interesting language that's enjoyable to write nontrivial programs in.

Tuples Runner

Here's tuples.rb I wrote to run Tuples:

require "set"
require "pry"
require "ostruct"

class Tuples
  def initialize(&block)
    @states = Set[]
    @queue = []
    @rules = []
    instance_eval(&block)
  end

  def self.run(**kwargs, &block)
    new(&block).run(**kwargs)
  end

  def tuple(*state)
    return if @states.include?(state)
    return if state.any?(&:negative?)
    @states << state
    @queue << state
  end

  def process(state)
    @rules.each do |pattern, block|
      process_rule state, pattern, block
    end
  end

  def process_rule(state, pattern, block)
    return unless pattern.size == state.size
    args = OpenStruct.new
    pattern.size.times do |i|
      if pattern[i].is_a?(Symbol)
        args[pattern[i]] = state[i]
      elsif pattern[i] != state[i]
        return
      end
    end
    tuple *args.instance_eval(&block)
  end

  def rule(*pattern, &block)
    @rules << [pattern, block]
  end

  def read_input
    input = STDIN.read.codepoints.each
    input.each_with_index do |b, i|
      tuple 0, 0, i, b
    end
    tuple 0, 0, input.size, 0
  end

  def run(input: false)
    read_input if input

    until @queue.empty?
      process @queue.pop
    end

    output = @states.select{|s| s.size == 4 and s[0,2] == [0,1]}.sort.map(&:last)
    print output.pack("U*")

    if output.empty? or ENV["DEBUG"]
      @states.sort.each do |state|
        p state
      end
    end
  end
end

I/O Limitations

There are some big limitations of this version, first of all the I/O. Right now it can only read all input at once at the start, and do all the output at once at the end.

This prevents writing interactive programs, but it even prevents just writing a program that print the same thing over and over.

One way to address it is to have some way to request data, like writing (0, 0, size) to mean "read input and populate (0, 0, i, c) until there's at least size". Doing it this way makes it independent of when different requests are made, only the biggest one matters.

As for intermixing input and output, an obvious way would be to immediately print all (0, 1, i, c) we get, if there are no gaps before them - but the gap skipping behavior is a big part of what makes the language so easy to use. The no-skip FizzBuzz would require a lot of spaghetti code doing everything linearly instead of data flowing smoothly from one part of the program to another.

We could do output requests just like input requests, so (0, 1, 1000) would means "print everything until 1000", I promise I won't add more. But that really doesn't work on its own, as Tuples by design doesn't specify any order of execution.

An interesting tweak of this idea is to always print everything before we ask for next character of input. And only actually get input if there are no more calculations to do.

The main loop would be something like this:

start with empty state
apply all rules as long as possible
print everything we have (either always whatever we have; or only if explicitly requested)
if no more rules apply, check if any request for additional input is in the state
if yes, get it, add to state and continue
if not, end the program

Multiple Inputs Limitations

A much bigger issue is that we have no way to combine two different tuples.

This makes a lot of I/O programs impossible. Even something as simple as program that reads two characters and says if they're same or not is impossible, as (0, 0, 0, a) (first character) and (0, 0, 1, b) (second character) can never end up in the same rule, no matter how much they bounce around.

We could fix it with I/O specific hack. So instead of (0, 0, i, a) encoding the first character, it could encode all the characters up to i in one big number. So a script that read 3 characters, would put these in the state (switched from Unicode to 8-bit, now you pretty much need to do your own UTF-8 decoding):

(0, 0, 0, a)
(0, 0, 1, a*256 + b)
(0, 0, 2, a*256*256 + b*256 + c)

This is not very elegant, but we could get quite far this way.

We can do byte decoding from that with (0, 0, i, b) -> (100, i, b%256). UTF-8 character decoder would be more rules, but I don't think it would be too complicated.

I think at some point we'd want to support rules with two dependencies, but I wonder how far we can get without doing so. There's never any need for 3+ dependency rules as A and B and C -> X is just A and B -> AB and AB and C -> X.

Fully general two-dependency rules would be very powerful and quite complicated to implement efficiently, so I wonder if there's some more limited idea that could be used instead.

Meta-programming Limitations

One thing I sort of wanted to do was to be able to do generic functions. Right now it's possible if we know how many items a tuple has.

Let's say we want to define 100.* as "if a == b then add state C". If we know that C has 4 elements, then we can do this (-(a-b)*(a-b) is zero if a==b, negative otherwise):

(100, a, b, c1, c2, c3, c4) -> (101, -(a-b)*(a-b), c1, c2, c3, c4)
(101, z, c1, c2, c3, c4) -> (c1, c2, c3, c4)

But we'd need to define 100.* for every possible tuple size.

This isn't really a huge limitation for small programs, as for any reasonable tuple size we can just copy and paste such rules, but it would be nice to be able to do something like this (with * or ... or such for "rest of arguments"):

(100, a, b, ...c) -> (101, -(a-b)*(a-b), ...c)
(101, z, ...c) -> (...c)

To be honest I don't think it would really add too much value, and it would increase complexity considerably.

Could Tuples be more restricted?

Instead of increasing the power, can we reduce it without making it too unbearable?

We could probably restrict what goes on the right side to a single operation per field. So instead of:

(10,a,b,c) -> (20,a+b*c)

You'd need to do:

(10,a,b,c) -> (11,a,b*c)
(11,a,bc) -> (20,a+bc)

One difficulty with that is that Tuples supports negative numbers as intermediate values, but not as part of the state, so naive translation won't always work, and we might need to do some extra work to avoid negative numbers.

Some operations like % are clearly unnecessary, as (a%b) is just a-(a/b)*b. I suspect we could replace multiply and divide with countdowns as well. I think we could get away with just constant, field+1, and field-1, but maybe there's something I didn't consider.

Getting rid of nonzero constant might be possible, but then we'd need to select kind of a tuple by its length, and that would be a huge hassle. Like (5, a, b) would need to be something like (0, 0, 0, 0, 0, a, b).

Any such changes would get Tuples closer and closer to "Turing Tarpit" territory, and wouldn't be very fun.

If we allow any expressions like now, the language would obviously still work with just 4 elements per tuple, as shown by the Turing Completeness example. By the same proof, it could still run Turing Machine with 3 by combining current symbol and state. Or even 2 by combining current symbol, current state, and one of the tape sides; the other symbol being the other tape side.

I don't really see how it would work as a real language with just 1, but it can do Collatz Conjecture:

(n) -> ((n%2) * (3*n+1) + (1-(n%2)) * n/2)

So maybe even 1-element Tuples is Turing complete in some crazy way.

Amazingly I think it could be Turing complete with just 2 number tuples, fixed initial state (0, 0), and 1 next rule with some ridiculous a % N sum.

Of course that would be another way to get into a Turing Tarpit, where everything is possible, and nothing is fun.

All my claims about Turing Completeness are based on briefly sketching a possible construction, and I never did proper math, so I could definitely be wrong.

Next steps

Once I figure out what kind of language I want, I'll need to do proper parser, but that would be premature now.

I'm really interested in trying out the following variants to see how they compare with the current variant as real esoteric language:

Tuples with interactive I/O
Tuples with 2-dependency rules

And these just for sake of some math fun:

Tuples with only const, +1, and -1
Tuples with just 2 numbers per tuple, fixed (0, 0) start state, and 1 rule - or whatever turns out to be the smallest Turing Complete version

By the way let me know if a similar language already exists. It's simple enough that it's entirely possible that someone might have come up with something really similar before.