I’ve always been somewhat interested in how programming languages are written. Last month I bought Writing An Interpreter In Go, and am about 40% of the way through that book (though my version is in C#). I’ve hand built a tokenizer and am starting to write my own parser.
It’s gotten me curious about programming languages, and I wanted to try my hand at building a brainfuck interpreter. Lets dive in.
Brainfuck is an esoteric programming langauge. It’s designed to be confusing, hence the name. The language only consists of 8 characters,
]. With an internal structure defined by wikipedia as:
The brainfuck language uses a simple machine model consisting of the program and instruction pointer, as well as an array of at least 30,000 byte cells initialized to zero; a movable data pointer (initialized to point to the leftmost byte of the array); and two streams of bytes for input and output (most often connected to a keyboard and a monitor respectively, and using the ASCII character encoding).
Each character in the language is defined as:
< : increment the data pointer (to point to the next cell to the right)
> : decrement the data pointer (to point to the next cell to the left)
+ : increment (increase by one) the byte at the data pointer
- : decrement (decrease by one) the byte at the data pointer.
. : output the byte at the data pointer
, : accept on byte of input, storing its value in the byte at the data pointer
[ : if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command.
] : if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.
Here is a program that prints out “Hello World”:
If this is your first esoteric programming language, this definitely throws you for a loop. Since this article is about documenting how to build an interpreter, here’s a great video describing brainfuck to anyone to wants a more visual representation.
So lets see some basic code examples.
If I wanted to write a single exclamation point, here is a program that would do that:
It uses the
+ character to add 1 to the current data pointer 33 times, 33 being the ASCII representation for an exclamation point. Then uses the
. character to print that single character.
Here are three different programs that print out
The first version generates and prints the exclamation pointer as before, then moves to the next cell and increments data pointer to 63, the ASCII representation for ?, and then prints it.
The second version reuses the cell of the exclamation mark, so it only needs to increment that cell 30 more times. 33 (!) + 30 = 63 (?)
The third version uses the
] brackets to repeat sections of code resulting in smaller programs.
I’ve named my interpreter Dizzy, because programming in brainfuck can make you dizzy.
Well that aside, there’s not a whole lot of complexity to the interpreter.
Lets start off by creating a class called
Interpreter that will be interpreting our program.
It has three instance variables: 1. A byte array to hold any data we might need, 2. A pointer telling us where we are in that array, 3. The input converted into a char array.
Then lets create a scaffold of our Run method, which is a switch statement for each of the 8 valid characters in our language.
In brainfuck, any other characters are just ignored, so anything not in this set of 8 will just be skipped over.
Given our definition above, for
<, we need to increment and decrement the pointer.
-, we need to increment and decrement the current value in our tape referenced by the pointer.
Then our IO characters are also pretty simple:
Finally we need to implement our brackets. Brackets help us repeat or skip sections of a brainfuck program, kind of like a goto statement. That’s why the third program for writing !? was so much shorter, brackets allowed the same section code to be run multiple times.
The only complication here is nested brackets. For a program like
[+[+]+]>+, which skips from the first open bracket to the last close bracket, we need to keep track which open/close brackets belong together.
We introduce an int called
unmatchedBracketCounter, (defined outside the for loop) which tracks how many pairs of brackets we’ve encountered. If we’re looking for a
] and encounter another
[, we know we will have to now pass two
] before we can stop.
Combining all these elements we end up with a single class that can interpret a brainfuck program.
Now for testing. As someone who codes for a living, I feel very comfortable writing programs in normal programming languages. Brainfuck programs are another matter.
Fortunately I found this great website which has brainfuck programs, can run brainfuck programs, and can generate brainfuck programs as well.
Here are some fun programs that I found on that website.
This calculates Pi to the 15 decimal place. When run the output is
3.14070455282885, which the first 15 digits of PI are
3.14159265358979, which I am using an 8 bit cell, so it seems like I’m experiencing some overflows as called out in the program.
When run this programs outputs all squares up to 10000, 0, 1, 4, 9 16, …, 9801, 1000.
When run this prints out the 99 bottles of beer chanty, from 99 all the way to 0.
And finally a program that takes advantage of user input. Most brainfuck programs that I found didn’t use the input character ,, I wrote one with the help of the text generator mentioned earlier.
There you have it. A interpreter that will run brainfuck programs.
I definitely enjoyed writing this interpreter. The first one that I’ve written from scratch. It was a fun and pretty different programming challenge from what I’m used to.
I hope you enjoyed it and continue to experiment with things that you’re interested in.
Originally published at thesharperdev.com.